Publications (10 of 41)
Berglund, M., Björklund, H. & Björklund, J. (2024). Parsing unranked tree languages, folded once. Algorithms, 17(6), Article ID 268.
2024 (English). In: Algorithms, E-ISSN 1999-4893, Vol. 17, no. 6, article id 268. Journal article (Peer-reviewed). Published
Abstract [en]

A regular unranked tree folding consists of a regular unranked tree language and a folding operation that merges (i.e., folds) selected nodes of a tree to form a graph; the combination is a formal device for representing graph languages. If, in the process of folding, the order among edges is discarded so that the result is an unordered graph, then two applications of a fold operation are enough to make the associated parsing problem NP-complete. However, if the order is kept, then the problem is solvable in non-uniform polynomial time. In this paper, we address the remaining case, where only one fold operation is applied, but the order among the edges is discarded. We show that, under these conditions, the problem is solvable in non-uniform polynomial time.

Place, publisher, year, edition, pages
MDPI, 2024
Keywords
graphs, transducers, trees, vector addition systems
National category
Identifiers
urn:nbn:se:umu:diva-227569 (URN); 10.3390/a17060268 (DOI); 2-s2.0-85196886791 (Scopus ID)
Funder
Swedish Research Council, 2020-03852; Wallenberg AI, Autonomous Systems and Software Program (WASP); Knut and Alice Wallenberg Foundation
Note

This paper is an extended version of a paper published in International Symposium on Fundamentals of Computation Theory, Trier, Germany, 18–21 September 2023.

Available from: 2024-07-02. Created: 2024-07-02. Last updated: 2024-07-02. Bibliographically approved
Björklund, H., Björklund, J. & Ericson, P. (2024). Tree-based generation of restricted graph languages. International Journal of Foundations of Computer Science, 35(1 & 2), 215-243
2024 (English). In: International Journal of Foundations of Computer Science, ISSN 0129-0541, Vol. 35, no. 1 & 2, pp. 215-243. Journal article (Peer-reviewed). Published
Abstract [en]

Order-preserving DAG grammars (OPDGs) are a formalism for representing languages of structurally restricted graphs. As demonstrated in [17], they are sufficiently expressive to model abstract meaning representations in natural language processing, a graph-based form of semantic representation in which nodes encode objects and edges relations. At the same time, they can be parsed in O(n^2 + nm) time, where m and n are the sizes of the grammar and the input graph, respectively. In this work, we provide an initial algebra semantics for OPDGs, which allows us to view them as regular tree grammars under an equivalence theory. This makes it possible to transfer results from the field of formal tree languages to the domain of OPDGs, both in the unweighted and the weighted case. In particular, we show that deterministic OPDGs can be minimised efficiently, and that they are learnable under the “minimal adequate teacher” paradigm, that is, by querying an oracle for equivalence between languages, and membership of individual graphs. To conclude, we demonstrate that the languages generated by OPDGs are definable in monadic second-order logic.

Place, publisher, year, edition, pages
World Scientific, 2024
Keywords
Graph languages, logic characterisation, MAT learning, minimization
National category
Identifiers
urn:nbn:se:umu:diva-217981 (URN); 10.1142/S0129054123480106 (DOI); 001109806500001 (); 2-s2.0-85178101785 (Scopus ID)
Funder
Swedish Research Council, 2020-03852; Wallenberg AI, Autonomous Systems and Software Program (WASP), Nest project Sting
Available from: 2023-12-15. Created: 2023-12-15. Last updated: 2024-05-14. Bibliographically approved
Devinney, H., Björklund, J. & Björklund, H. (2024). We don’t talk about that: case studies on intersectional analysis of social bias in large language models. In: Agnieszka Faleńska; Christine Basta; Marta Costa-jussà; Seraphina Goldfarb-Tarrant; Debora Nozza (Ed.), Proceedings of the 5th workshop on gender bias in natural language processing (GeBNLP). Paper presented at Workshop on Gender Bias in Natural Language Processing (GeBNLP), Bangkok, Thailand, 16th August, 2024 (pp. 33-44). Association for Computational Linguistics
2024 (English). In: Proceedings of the 5th workshop on gender bias in natural language processing (GeBNLP) / [ed] Agnieszka Faleńska; Christine Basta; Marta Costa-jussà; Seraphina Goldfarb-Tarrant; Debora Nozza, Association for Computational Linguistics, 2024, pp. 33-44. Conference paper, Published paper (Peer-reviewed)
Abstract [en]

Despite concerns that Large Language Models (LLMs) are vectors for reproducing and amplifying social biases such as sexism, transphobia, islamophobia, and racism, there is a lack of work qualitatively analyzing how such patterns of bias are generated by LLMs. We use mixed-methods approaches and apply a feminist, intersectional lens to the problem across two language domains, Swedish and English, by generating narrative texts using LLMs. We find that hegemonic norms are consistently reproduced; dominant identities are often treated as ‘default’; and discussion of identity itself may be considered ‘inappropriate’ by the safety features applied to some LLMs. Due to the differing behaviors of models, depending both on their design and the language they are trained on, we observe that strategies of identifying “bias” must be adapted to individual models and their socio-cultural contexts.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2024
National category
Research subject
computational linguistics
Identifiers
urn:nbn:se:umu:diva-228891 (URN); 979-8-89176-137-7 (ISBN)
Conference
Workshop on Gender Bias in Natural Language Processing (GeBNLP), Bangkok, Thailand, 16th August, 2024.
Available from: 2024-08-29. Created: 2024-08-29. Last updated: 2024-08-29. Bibliographically approved
Björklund, H. & Devinney, H. (2023). Computer, enhence: POS-tagging improvements for nonbinary pronoun use in Swedish. In: Proceedings of the third workshop on language technology for equality, diversity, inclusion. Paper presented at Third Workshop on Language Technology for Equality, Diversity, Inclusion (LT-EDI-2023) at RANLP 2023, Varna, Bulgaria, September 7, 2023 (pp. 54-61). The Association for Computational Linguistics
2023 (English). In: Proceedings of the third workshop on language technology for equality, diversity, inclusion, The Association for Computational Linguistics, 2023, pp. 54-61. Conference paper, Published paper (Peer-reviewed)
Abstract [en]

Part of Speech (POS) taggers for Swedish routinely fail for the third person gender-neutral pronoun hen, despite the fact that it has been a well-established part of the Swedish language since at least 2014. In addition to simply being a form of gender bias, this failure can have negative effects on other tasks relying on POS information. We demonstrate the usefulness of semi-synthetic augmented datasets in a case study, retraining a POS tagger to correctly recognize hen as a personal pronoun. We evaluate our retrained models both for tag accuracy and on a downstream task (dependency parsing) in a classical NLP pipeline.

Our results show that adding such data works to correct for the disparity in performance. The accuracy rate for identifying hen as a pronoun can be brought up to acceptable levels with only minor adjustments to the tagger’s vocabulary files. Performance parity to gendered pronouns can be reached after retraining with only a few hundred examples. This increase in POS tag accuracy also results in improvements for dependency parsing sentences containing hen.

Place, publisher, year, edition, pages
The Association for Computational Linguistics, 2023
Keywords
Part-of-Speech, gendered pronouns, neopronouns
National category
Research subject
computational linguistics
Identifiers
urn:nbn:se:umu:diva-213782 (URN); 10.26615/978-954-452-084-7_008 (DOI); 2-s2.0-85184990283 (Scopus ID); 978-954-452-084-7 (ISBN)
Conference
Third Workshop on Language Technology for Equality, Diversity, Inclusion (LT-EDI-2023) at RANLP 2023, Varna, Bulgaria, September 7, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-09-26. Created: 2023-09-26. Last updated: 2024-02-27. Bibliographically approved
Berglund, M., Björklund, H. & Björklund, J. (2023). Parsing unranked tree languages, folded once. In: Henning Fernau; Klaus Jansen (Ed.), Fundamentals of computation theory: 24th International Symposium, FCT 2023, Trier, Germany, September 18–21, 2023, Proceedings. Paper presented at 24th International Symposium on Fundamentals of Computation Theory, FCT 2023, Trier, Germany, September 18–21, 2023 (pp. 60-73). Springer Nature
2023 (English). In: Fundamentals of computation theory: 24th International Symposium, FCT 2023, Trier, Germany, September 18–21, 2023, Proceedings / [ed] Henning Fernau; Klaus Jansen, Springer Nature, 2023, pp. 60-73. Conference paper, Published paper (Peer-reviewed)
Abstract [en]

A regular unranked tree folding consists of a regular unranked tree language and a folding operation that merges, i.e., folds, selected nodes of a tree to form a graph; the combination is a formal device for representing graph languages. If, in the process of folding, the order among edges is discarded so that the result is an unordered graph, then two applications of a fold operation are enough to make the associated parsing problem NP-complete. However, if the order is kept, then the problem is solvable in non-uniform polynomial time. In this paper, we address the remaining case, where only one fold operation is applied, but the order among edges is discarded. We show that, under these conditions, the problem is solvable in non-uniform polynomial time.

Place, publisher, year, edition, pages
Springer Nature, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14292
National category
Identifiers
urn:nbn:se:umu:diva-215936 (URN); 10.1007/978-3-031-43587-4_5 (DOI); 2-s2.0-85174590997 (Scopus ID); 9783031435867 (ISBN)
Conference
24th International Symposium on Fundamentals of Computation Theory, FCT 2023, Trier, Germany, September 18–21, 2023
Available from: 2023-11-02. Created: 2023-11-02. Last updated: 2023-11-02. Bibliographically approved
Berglund, M., Björklund, H., Björklund, J. & Boiret, A. (2023). Transduction from trees to graphs through folding. Information and Computation, 295, Article ID 105111.
2023 (English). In: Information and Computation, ISSN 0890-5401, E-ISSN 1090-2651, Vol. 295, article id 105111. Journal article (Peer-reviewed). Published
Abstract [en]

We introduce a fold operation that realises a tree-to-graph transduction by merging selected nodes in the input tree to form a possibly cyclic output graph. The work is motivated by the increasing use of graph-based representations in semantic parsing. We show that a suitable class of graph languages can be generated by applying the fold operation to regular unranked tree languages. We investigate two versions of the fold operation, one that preserves a depth-first ordering between the edges, and one that does not. Finally, we demonstrate that the associated non-uniform membership problem is solvable in polynomial time for the order-preserving version, and NP-complete for the order-cancelling one.

Place, publisher, year, edition, pages
Elsevier, 2023
Keywords
Graphs, Semantic representations, Transducers, Trees
National category
Identifiers
urn:nbn:se:umu:diva-216195 (URN); 10.1016/j.ic.2023.105111 (DOI); 2-s2.0-85175145940 (Scopus ID)
Funder
Swedish Research Council, 2020-03852
Available from: 2023-11-08. Created: 2023-11-08. Last updated: 2023-11-08. Bibliographically approved
Björklund, H. & Devinney, H. (2022). Improving Swedish part-of-speech tagging for hen. Paper presented at Swedish Language Technology Conference 2022, Stockholm, Sweden, November 23-25, 2022.
2022 (English). Conference paper, Oral presentation only (Peer-reviewed)
Abstract [en]

Despite the fact that the gender-neutral pronoun hen was officially added to the Swedish language in 2014, state-of-the-art part-of-speech taggers still routinely fail to identify it as a pronoun. We retrain both efselab and spaCy models with augmented (semi-synthetic) data, where instances of gendered pronouns are replaced by hen to correct for the lack of representation in the original training data. Our results show that adding such data works to correct for the disparity in performance.

Keywords
Part-of-Speech, gendered pronouns, neopronouns
National category
Research subject
computational linguistics
Identifiers
urn:nbn:se:umu:diva-201268 (URN)
Conference
Swedish Language Technology Conference 2022, Stockholm, Sweden, November 23-25, 2022
Available from: 2022-11-24. Created: 2022-11-24. Last updated: 2022-11-28. Bibliographically approved
Devinney, H., Björklund, J. & Björklund, H. (2022). Theories of gender in natural language processing. In: Proceedings of the fifth annual ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT'22). Paper presented at ACM FAccT Conference 2022, Conference on Fairness, Accountability, and Transparency, Hybrid via Seoul, South Korea, June 21-24, 2022 (pp. 2083-2102). Association for Computing Machinery (ACM)
2022 (English). In: Proceedings of the fifth annual ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT'22), Association for Computing Machinery (ACM), 2022, pp. 2083-2102. Conference paper, Published paper (Peer-reviewed)
Abstract [en]

The rise of concern around Natural Language Processing (NLP) technologies containing and perpetuating social biases has led to a rich and rapidly growing area of research. Gender bias is one of the central biases being analyzed, but to date there is no comprehensive analysis of how “gender” is theorized in the field. We survey nearly 200 articles concerning gender bias in NLP to discover how the field conceptualizes gender both explicitly (e.g. through definitions of terms) and implicitly (e.g. through how gender is operationalized in practice). In order to get a better idea of emerging trajectories of thought, we split these articles into two sections by time.

We find that the majority of the articles do not make their theorization of gender explicit, even if they clearly define “bias.” Almost none use a model of gender that is intersectional or inclusive of nonbinary genders; and many conflate sex characteristics, social gender, and linguistic gender in ways that disregard the existence and experience of trans, nonbinary, and intersex people. There is an increase between the two time-sections in statements acknowledging that gender is a complicated reality; however, very few articles manage to put this acknowledgment into practice. In addition to analyzing these findings, we provide specific recommendations to facilitate interdisciplinary work, and to incorporate theory and methodology from Gender Studies. Our hope is that this will produce more inclusive gender bias research in NLP.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
natural language processing, gender bias, gender studies
National category
Research subject
computer science; gender studies
Identifiers
urn:nbn:se:umu:diva-194742 (URN); 10.1145/3531146.3534627 (DOI); 2-s2.0-85133018925 (Scopus ID); 9781450393522 (ISBN)
Conference
ACM FAccT Conference 2022, Conference on Fairness, Accountability, and Transparency, Hybrid via Seoul, South Korea, June 21-24, 2022
Note

Alternative title: "Theories of 'Gender' in NLP Bias Research"

Available from: 2022-05-16. Created: 2022-05-16. Last updated: 2024-08-27. Bibliographically approved
Björklund, H., Drewes, F., Ericson, P. & Starke, F. (2021). Uniform Parsing for Hyperedge Replacement Grammars. Journal of computer and system sciences (Print), 118, 1-27
2021 (English). In: Journal of computer and system sciences (Print), ISSN 0022-0000, E-ISSN 1090-2724, Vol. 118, pp. 1-27. Journal article (Peer-reviewed). Published
Abstract [en]

It is well known that hyperedge-replacement grammars can generate NP-complete graph languages even under seemingly harsh restrictions. This means that the parsing problem is difficult even in the non-uniform setting, in which the grammar is considered to be fixed rather than being part of the input. Little is known about restrictions under which truly uniform polynomial parsing is possible. In this paper we propose a low-degree polynomial-time algorithm that solves the uniform parsing problem for a restricted type of hyperedge-replacement grammars which we expect to be of interest for practical applications.

Place, publisher, year, edition, pages
Elsevier, 2021
Keywords
parsing, graph language, graph grammar, abstract meaning representation
National category
Research subject
computer science
Identifiers
urn:nbn:se:umu:diva-177125 (URN); 10.1016/j.jcss.2020.10.002 (DOI); 000615930900001 (); 2-s2.0-85097717738 (Scopus ID)
Available from: 2020-11-29. Created: 2020-11-29. Last updated: 2023-09-05. Bibliographically approved
Devinney, H., Björklund, J. & Björklund, H. (2020). Crime and Relationship: Exploring Gender Bias in NLP Corpora. Paper presented at SLTC 2020 – The Eighth Swedish Language Technology Conference, 25–27 November 2020, Online.
2020 (English). Conference paper, Published paper (Peer-reviewed)
Abstract [en]

Gender bias in natural language processing (NLP) tools, deriving from implicit human bias embedded in language data, is an important and complicated problem on the road to fair algorithms. We leverage topic modeling to retrieve documents associated with particular gendered categories, and discuss how exploring these documents can inform our understanding of the corpora we may use to train NLP tools. This is a starting point for challenging the systemic power structures and producing a justice-focused approach to NLP.

Keywords
gender bias, topic modeling
National category
Research subject
computer science; gender studies
Identifiers
urn:nbn:se:umu:diva-177583 (URN)
Conference
SLTC 2020 – The Eighth Swedish Language Technology Conference, 25–27 November 2020, Online
Projects
EQUITBL
Available from: 2020-12-14. Created: 2020-12-14. Last updated: 2021-01-14. Bibliographically approved
Projects
Parametriserad syntaktisk analys för naturliga språk [2011-06080_VR]; Umeå University
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-4696-9787