Improving Swedish part-of-speech tagging for hen
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Foundations of Language Processing) ORCID iD: 0000-0002-4696-9787
Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Social Sciences, Umeå Centre for Gender Studies (UCGS). (Foundations of Language Processing) ORCID iD: 0000-0003-0278-9757
2022 (English) Conference paper, Oral presentation only (Refereed)
Abstract [en]

Although the gender-neutral pronoun hen was officially added to the Swedish language in 2014, state-of-the-art part-of-speech taggers still routinely fail to identify it as a pronoun. We retrain both efselab and spaCy models with augmented (semi-synthetic) data, in which instances of gendered pronouns are replaced by hen to compensate for its lack of representation in the original training data. Our results show that adding such data corrects the disparity in performance.
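
The augmentation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the pronoun mapping, the use of the Universal Dependencies tag PRON, the choice of hen as the object form (henom is also in use), and the names HEN_FORMS, match_case and augment_sentence are all assumptions made for illustration. The sketch takes a sentence as (token, tag) pairs, swaps gendered pronouns for the corresponding form of hen, and leaves the gold tags unchanged so the result can be added to the training data.

```python
import random

# Assumed mapping from gendered pronouns to forms of "hen" (illustrative only).
HEN_FORMS = {
    "han": "hen", "hon": "hen",        # subject forms
    "honom": "hen", "henne": "hen",    # object forms (assumed "hen", not "henom")
    "hans": "hens", "hennes": "hens",  # possessive forms
}

# Assumed tag set: Universal Dependencies marks pronouns as PRON.
PRONOUN_TAGS = {"PRON"}


def match_case(replacement, original):
    """Copy the capitalisation pattern of the original token."""
    if len(original) > 1 and original.isupper():
        return replacement.upper()
    if original[0].isupper():
        return replacement.capitalize()
    return replacement


def augment_sentence(tagged, rate=1.0, rng=None):
    """Return a copy of the sentence with gendered pronouns replaced by hen.

    tagged: list of (token, tag) pairs.
    rate:   probability of replacing each eligible pronoun.
    """
    rng = rng or random.Random(0)
    augmented = []
    for token, tag in tagged:
        new_form = HEN_FORMS.get(token.lower())
        # Only replace tokens tagged as pronouns, so that e.g. the
        # given name "Hans" (a proper noun) is left alone.
        if new_form is not None and tag in PRONOUN_TAGS and rng.random() < rate:
            token = match_case(new_form, token)
        augmented.append((token, tag))
    return augmented


sentence = [("Hon", "PRON"), ("gav", "VERB"), ("boken", "NOUN"),
            ("till", "ADP"), ("honom", "PRON"), (".", "PUNCT")]
print(augment_sentence(sentence))
```

Keeping the original tags is what makes the data semi-synthetic in this sketch: only the surface form changes, so the tagger sees hen in contexts where a pronoun tag is already known to be correct.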

Place, publisher, year, edition, pages
2022.
Keywords [en]
Part-of-Speech, gendered pronouns, neopronouns
National Category
Language Technology (Computational Linguistics)
Research subject
computational linguistics
Identifiers
URN: urn:nbn:se:umu:diva-201268
OAI: oai:DiVA.org:umu-201268
DiVA, id: diva2:1713349
Conference
Swedish Language Technology Conference 2022, Stockholm, Sweden, November 23-25, 2022
Available from: 2022-11-24 Created: 2022-11-24 Last updated: 2022-11-28 Bibliographically approved

Open Access in DiVA

fulltext (174 kB), 93 downloads
File information
File name: FULLTEXT01.pdf
File size: 174 kB
Checksum (SHA-512):
22a4b377609ef01729f9b4ec1da3127028a290267698fc1d3a652deca00dfa0fc1871501b40d6e2e17a431c6bdf665c241a30023346558f2fc5cc4eccbd81882
Type: fulltext
Mimetype: application/pdf

Other links

https://2022.sltc.se/papers/SLTC22_paper_918.pdf

Authority records

Björklund, Henrik; Devinney, Hannah

Total: 93 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 373 hits