Publications (3 of 3)
Devinney, H., Björklund, J. & Björklund, H. (2022). Theories of gender in natural language processing. In: Proceedings of the fifth annual ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT'22). Paper presented at ACM FAccT Conference 2022, Conference on Fairness, Accountability, and Transparency, Hybrid via Seoul, South Korea, June 21-24, 2022 (pp. 2083-2102). Association for Computing Machinery (ACM)
Theories of gender in natural language processing
2022 (English) In: Proceedings of the fifth annual ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT'22), Association for Computing Machinery (ACM), 2022, pp. 2083-2102. Conference paper, Published paper (Refereed)
Abstract [en]

The rise of concern around Natural Language Processing (NLP) technologies containing and perpetuating social biases has led to a rich and rapidly growing area of research. Gender bias is one of the central biases being analyzed, but to date there is no comprehensive analysis of how “gender” is theorized in the field. We survey nearly 200 articles concerning gender bias in NLP to discover how the field conceptualizes gender both explicitly (e.g. through definitions of terms) and implicitly (e.g. through how gender is operationalized in practice). In order to get a better idea of emerging trajectories of thought, we split these articles into two sections by time.

We find that the majority of the articles do not make their theorization of gender explicit, even if they clearly define “bias.” Almost none use a model of gender that is intersectional or inclusive of nonbinary genders; and many conflate sex characteristics, social gender, and linguistic gender in ways that disregard the existence and experience of trans, nonbinary, and intersex people. There is an increase between the two time-sections in statements acknowledging that gender is a complicated reality; however, very few articles manage to put this acknowledgment into practice. In addition to analyzing these findings, we provide specific recommendations to facilitate interdisciplinary work, and to incorporate theory and methodology from Gender Studies. Our hope is that this will produce more inclusive gender bias research in NLP.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
natural language processing, gender bias, gender studies
National subject category
Natural Language Processing and Computational Linguistics; Gender Studies
Research subject
computer science; gender studies
Identifiers
urn:nbn:se:umu:diva-194742 (URN)
10.1145/3531146.3534627 (DOI)
2-s2.0-85133018925 (Scopus ID)
9781450393522 (ISBN)
Conference
ACM FAccT Conference 2022, Conference on Fairness, Accountability, and Transparency, Hybrid via Seoul, South Korea, June 21-24, 2022
Note

Alternative title: "Theories of 'Gender' in NLP Bias Research"

Available from: 2022-05-16 Created: 2022-05-16 Last updated: 2025-02-01 Bibliographically reviewed
Devinney, H., Björklund, J. & Björklund, H. (2020). Crime and Relationship: Exploring Gender Bias in NLP Corpora. Paper presented at SLTC 2020 – The Eighth Swedish Language Technology Conference, 25–27 November 2020, Online.
Crime and Relationship: Exploring Gender Bias in NLP Corpora
2020 (English) Conference paper, Published paper (Refereed)
Abstract [en]

Gender bias in natural language processing (NLP) tools, deriving from implicit human bias embedded in language data, is an important and complicated problem on the road to fair algorithms. We leverage topic modeling to retrieve documents associated with particular gendered categories, and discuss how exploring these documents can inform our understanding of the corpora we may use to train NLP tools. This is a starting point for challenging the systemic power structures and producing a justice-focused approach to NLP.
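As an illustration of the kind of quantitative pre-filtering this abstract describes — retrieving documents associated with particular gendered categories so that a human-sized subset can then be read closely — a minimal seed-lexicon scorer might look as follows. The seed words, threshold, and function names here are hypothetical and invented for the example; they are not taken from the paper:

```python
from collections import Counter

# Hypothetical seed lexicons for three gendered categories (illustrative only).
SEEDS = {
    "feminine": {"she", "her", "woman", "mother"},
    "masculine": {"he", "him", "man", "father"},
    "neutral": {"they", "them", "person", "parent"},
}

def category_scores(tokens):
    """Fraction of tokens matching each category's seed lexicon."""
    counts = Counter(t.lower() for t in tokens)
    total = sum(counts.values()) or 1
    return {cat: sum(counts[w] for w in seeds) / total
            for cat, seeds in SEEDS.items()}

def retrieve(docs, category, threshold=0.05):
    """Return indices of documents whose score for `category` exceeds threshold."""
    return [i for i, d in enumerate(docs)
            if category_scores(d.split())[category] > threshold]

docs = [
    "She told her mother about the new job",
    "He asked his father for advice",
    "They went to the park with a friend",
]
print(retrieve(docs, "feminine"))  # -> [0]
```

The retrieved subset, rather than the full corpus, is what a qualitative analysis would then examine — the combination of quantitative narrowing and qualitative reading that the abstract argues for.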

Keywords
gender bias, topic modeling
National subject category
Natural Language Processing and Computational Linguistics; Gender Studies
Research subject
computer science; gender studies
Identifiers
urn:nbn:se:umu:diva-177583 (URN)
Conference
SLTC 2020 – The Eighth Swedish Language Technology Conference, 25–27 November 2020, Online
Project
EQUITBL
Available from: 2020-12-14 Created: 2020-12-14 Last updated: 2025-02-01 Bibliographically reviewed
Devinney, H., Björklund, J. & Björklund, H. (2020). Semi-Supervised Topic Modeling for Gender Bias Discovery in English and Swedish. In: Marta R. Costa-jussà, Christian Hardmeier, Will Radford, Kellie Webster (Eds.), Proceedings of the Second Workshop on Gender Bias in Natural Language Processing. Paper presented at GeBNLP2020, COLING'2020 – The 28th International Conference on Computational Linguistics, December 8-13, 2020, Online (pp. 79-92). Association for Computational Linguistics
Semi-Supervised Topic Modeling for Gender Bias Discovery in English and Swedish
2020 (English) In: Proceedings of the Second Workshop on Gender Bias in Natural Language Processing / [ed] Marta R. Costa-jussà, Christian Hardmeier, Will Radford, Kellie Webster, Association for Computational Linguistics, 2020, pp. 79-92. Conference paper, Published paper (Refereed)
Abstract [en]

Gender bias has been identified in many models for Natural Language Processing, stemming from implicit biases in the text corpora used to train the models. Such corpora are too large to closely analyze for biased or stereotypical content. Thus, we argue for a combination of quantitative and qualitative methods, where the quantitative part produces a view of the data of a size suitable for qualitative analysis. We investigate the usefulness of semi-supervised topic modeling for the detection and analysis of gender bias in three corpora (mainstream news articles in English and Swedish, and LGBTQ+ web content in English). We compare differences in topic models for three gender categories (masculine, feminine, and nonbinary or neutral) in each corpus. We find that in all corpora, genders are treated differently and that these differences tend to correspond to hegemonic ideas of gender.
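The semi-supervised (seeded) topic modeling idea — nudging each topic toward one gendered category by giving that category's seed words extra prior mass — can be sketched with a toy collapsed Gibbs sampler. This is a generic illustration under assumed hyperparameters, not the authors' implementation; the seed sets, `boost` value, and function name are invented for the example:

```python
import random

def seeded_lda(docs, seeds, alpha=0.1, beta=0.01, boost=5.0, iters=100, rng=None):
    """Collapsed Gibbs LDA where topic k's word prior is boosted on seeds[k].

    docs: list of token lists; seeds: list of K seed-word sets.
    Returns per-document topic counts (each row sums to the document length).
    """
    rng = rng or random.Random(0)
    K = len(seeds)
    vocab = sorted({w for d in docs for w in d})
    widx = {w: i for i, w in enumerate(vocab)}
    # Asymmetric topic-word prior: extra mass on each topic's seed words.
    prior = [[beta + (boost if w in seeds[k] else 0.0) for w in vocab]
             for k in range(K)]
    prior_sum = [sum(row) for row in prior]
    nkw = [[0] * len(vocab) for _ in range(K)]  # topic-word counts
    nk = [0] * K                                # tokens per topic
    ndk = [[0] * K for _ in docs]               # document-topic counts
    z = []                                      # topic assignment per token
    for d, doc in enumerate(docs):              # random initialization
        zs = []
        for w in doc:
            k = rng.randrange(K)
            zs.append(k)
            nkw[k][widx[w]] += 1; nk[k] += 1; ndk[d][k] += 1
        z.append(zs)
    for _ in range(iters):                      # Gibbs sweeps
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = z[d][i], widx[w]
                nkw[k][wi] -= 1; nk[k] -= 1; ndk[d][k] -= 1
                weights = [(ndk[d][t] + alpha)
                           * (nkw[t][wi] + prior[t][wi]) / (nk[t] + prior_sum[t])
                           for t in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                nkw[k][wi] += 1; nk[k] += 1; ndk[d][k] += 1
    return ndk

# Tiny toy corpus: with strong seeding, each document should lean toward
# the topic whose seed words it contains.
docs = [["she", "her", "mother", "job"],
        ["he", "him", "father", "job"],
        ["they", "them", "parent", "job"]]
seeds = [{"she", "her", "mother"},
         {"he", "him", "father"},
         {"they", "them", "parent"}]
topic_counts = seeded_lda(docs, seeds)
```

Comparing the resulting topics across the gendered categories — top words per topic, which documents load on which topic — is the view of the data that the abstract argues is small enough for qualitative analysis.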

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2020
Keywords
gender bias, topic modelling
National subject category
Natural Language Processing and Computational Linguistics; Gender Studies
Research subject
computer science; gender studies
Identifiers
urn:nbn:se:umu:diva-177576 (URN)
Conference
GeBNLP2020, COLING'2020 – The 28th International Conference on Computational Linguistics, December 8-13, 2020, Online
Project
EQUITBL
Available from: 2020-12-14 Created: 2020-12-14 Last updated: 2025-02-01 Bibliographically reviewed
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-4954-4397
