Publications (10 of 14)
Aler Tubella, A., Coelho Mollo, D., Dahlgren, A., Devinney, H., Dignum, V., Ericson, P., . . . Nieves, J. C. (2023). ACROCPoLis: a descriptive framework for making sense of fairness. In: FAccT '23: Proceedings of the 2023 ACM conference on fairness, accountability, and transparency. Paper presented at 2023 ACM Conference on Fairness, Accountability, and Transparency, Chicago, Illinois, USA, June 12-15, 2023 (pp. 1014-1025). ACM Digital Library
2023 (English). In: FAccT '23: Proceedings of the 2023 ACM conference on fairness, accountability, and transparency, ACM Digital Library, 2023, p. 1014-1025. Conference paper, Published paper (Refereed)
Abstract [en]

Fairness is central to the ethical and responsible development and use of AI systems, with a large number of frameworks and formal notions of algorithmic fairness being available. However, many of the fairness solutions proposed revolve around technical considerations and not the needs of and consequences for the most impacted communities. We therefore want to take the focus away from definitions and allow for the inclusion of societal and relational aspects to represent how the effects of AI systems impact and are experienced by individuals and social groups. In this paper, we do this by means of proposing the ACROCPoLis framework to represent allocation processes with a modeling emphasis on fairness aspects. The framework provides a shared vocabulary in which the factors relevant to fairness assessments for different situations and procedures are made explicit, as well as their interrelationships. This enables us to compare analogous situations, to highlight the differences in dissimilar situations, and to capture differing interpretations of the same situation by different stakeholders.

Place, publisher, year, edition, pages
ACM Digital Library, 2023
Keywords
Algorithmic fairness; socio-technical processes; social impact of AI; responsible AI
National Category
Information Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-209705 (URN); 10.1145/3593013.3594059 (DOI); 2-s2.0-85163594710 (Scopus ID); 978-1-4503-7252-7 (ISBN)
Conference
2023 ACM Conference on Fairness, Accountability, and Transparency, Chicago, Illinois, USA, June 12-15, 2023
Available from: 2023-06-13 Created: 2023-06-13 Last updated: 2023-07-18 Bibliographically approved
Andersson, E., Björklund, J., Drewes, F. & Jonsson, A. (2023). Generating semantic graph corpora with graph expansion grammar. In: Nagy B., Freund R. (Ed.), 13th International Workshop on Non-Classical Models of Automata and Applications (NCMA 2023). Paper presented at 13th International Workshop on Non-Classical Models of Automata and Applications, NCMA 2023, 18-19 September, 2023, Famagusta, Cyprus (pp. 3-15). Open Publishing Association, 388
2023 (English). In: 13th International Workshop on Non-Classical Models of Automata and Applications (NCMA 2023) / [ed] Nagy B., Freund R., Open Publishing Association, 2023, Vol. 388, p. 3-15. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce LOVELACE, a tool for creating corpora of semantic graphs. The system uses graph expansion grammar as a representational language, thus allowing users to craft a grammar that describes a corpus with desired properties. When given such a grammar as input, the system generates a set of output graphs that are well-formed according to the grammar, i.e., a graph bank. The generation process can be controlled via a number of configurable parameters that allow the user to, for example, specify a range of desired output graph sizes. Central use cases are the creation of synthetic data to augment existing corpora, and use as a pedagogical tool for teaching formal language theory.

Place, publisher, year, edition, pages
Open Publishing Association, 2023
Series
Electronic Proceedings in Theoretical Computer Science, ISSN 2075-2180
Keywords
semantic representation, graph corpora, graph grammar
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-212143 (URN); 10.4204/EPTCS.388.3 (DOI); 2-s2.0-85173059788 (Scopus ID)
Conference
13th International Workshop on Non-Classical Models of Automata and Applications, NCMA 2023, 18-19 September, 2023, Famagusta, Cyprus
Funder
Swedish Research Council, 2020-03852
Available from: 2023-07-18 Created: 2023-07-18 Last updated: 2023-10-18 Bibliographically approved
Björklund, J., Drewes, F. & Jonsson, A. (2023). Generation and polynomial parsing of graph languages with non-structural reentrancies. Computational linguistics - Association for Computational Linguistics (Print), 49(4), 841-882
2023 (English). In: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 49, no 4, p. 841-882. Article in journal (Refereed) Published
Abstract [en]

Graph-based semantic representations are popular in natural language processing (NLP), where it is often convenient to model linguistic concepts as nodes and relations as edges between them. Several attempts have been made to find a generative device that is sufficiently powerful to describe languages of semantic graphs, while at the same time allowing efficient parsing. We contribute to this line of work by introducing graph extension grammar, a variant of the contextual hyperedge replacement grammars proposed by Hoffmann et al. Contextual hyperedge replacement can generate graphs with non-structural reentrancies, a type of node-sharing that is very common in formalisms such as abstract meaning representation, but which context-free types of graph grammars cannot model. To provide our formalism with a way to place reentrancies in a linguistically meaningful way, we endow rules with logical formulas in counting monadic second-order logic. We then present a parsing algorithm and show as our main result that this algorithm runs in polynomial time on graph languages generated by a subclass of our grammars, the so-called local graph extension grammars.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2023
Keywords
Graph grammar, semantic graph, meaning representation, graph parsing
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science; computational linguistics
Identifiers
urn:nbn:se:umu:diva-209515 (URN); 10.1162/coli_a_00488 (DOI); 001152974700005 (); 2-s2.0-85173016925 (Scopus ID)
Projects
STING – Synthesis and analysis with Transducers and Invertible Neural Generators
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2020-03852
Available from: 2023-06-10 Created: 2023-06-10 Last updated: 2024-02-19 Bibliographically approved
Björklund, J., Drewes, F. & Jonsson, A. (2022). Improved N-Best Extraction with an Evaluation on Language Data. Computational linguistics - Association for Computational Linguistics (Print), 48(1), 119-153
2022 (English). In: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 48, no 1, p. 119-153. Article in journal (Refereed) Published
Abstract [en]

We show that a previously proposed algorithm for the N-best trees problem can be made more efficient by changing how it arranges and explores the search space. Given an integer N and a weighted tree automaton (wta) M over the tropical semiring, the algorithm computes N trees of minimal weight with respect to M. Compared with the original algorithm, the modifications increase the laziness of the evaluation strategy, which makes the new algorithm asymptotically more efficient than its predecessor. The algorithm is implemented in the software BETTY, and compared to the state-of-the-art algorithm for extracting the N best runs, implemented in the software toolkit TIBURON. The data sets used in the experiments are wtas resulting from real-world natural language processing tasks, as well as artificially created wtas with varying degrees of nondeterminism. We find that BETTY outperforms TIBURON on all tested data sets with respect to running time, while TIBURON seems to be the more memory-efficient choice.

Place, publisher, year, edition, pages
MIT Press, 2022
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-194338 (URN); 10.1162/COLI_a_00427 (DOI); 000993785100004 (); 2-s2.0-85128188225 (Scopus ID)
Available from: 2022-05-04 Created: 2022-05-04 Last updated: 2023-09-05 Bibliographically approved
Jiang, L., Jonsson, A. & Vanhée, L. (Eds.). (2022). Proceedings of Umeå's 25th Student Conference in Computing Science (USCCS 2022). Paper presented at Umeå's 25th Student Conference in Computing Science (USCCS 2022). Umeå: Umeå University
2022 (English). Conference proceedings (editor) (Other academic)
Abstract [en]

The Umeå Student Conference in Computing Science (USCCS) is organized annually as part of a course given by the Computing Science department at Umeå University. The objective of the course is to give the students a practical introduction to independent research, scientific writing, and oral presentation.

A student who participates in the course first selects a topic and a research question that they are interested in. If the topic is accepted, the student outlines a paper and composes an annotated bibliography to give a survey of the research topic. The main work consists of conducting the actual research that answers the question asked, and convincingly and clearly reporting the results in a scientific paper. Another major part of the course is multiple internal peer review meetings in which groups of students read each other's papers and give feedback to the author. This process gives valuable training in both giving and receiving criticism in a constructive manner. Altogether, the students learn to formulate and develop their own ideas in a scientific manner, in a process involving internal peer reviewing of each other's work and under supervision of the teachers, and incremental development and refinement of a scientific paper.

Each scientific paper is submitted to USCCS through an on-line submission system, and receives reviews written by members of the Computing Science department. Based on the reviews, the editors of the conference proceedings (the teachers of the course) issue a decision of preliminary acceptance of the paper to each author. If, after final revision, a paper is accepted, the student is given the opportunity to present the work at the conference. The review process and the conference format aim at mimicking realistic settings for publishing and participation at scientific conferences.

USCCS is the highlight of the course, and this year the conference received 10 submissions, which were carefully reviewed by the teachers of the course. As a result of the reviewing process, 6 submissions were accepted for presentation at the conference.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2022. p. 79
Series
Report / UMINF, ISSN 0348-0542 ; 22.01
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-191144 (URN)
Conference
Umeå's 25th Student Conference in Computing Science (USCCS 2022)
Available from: 2022-01-10 Created: 2022-01-10 Last updated: 2023-03-16 Bibliographically approved
Jonsson, A. (2021). Best Trees Extraction and Contextual Grammars for Language Processing. (Doctoral dissertation). Umeå: Umeå universitet
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Extrahering av optimala träd samt kontextuella grafgrammatiker för språkbearbetning
Abstract [en]

In natural language processing, the syntax of a sentence refers to the words used in the sentence, their grammatical role, and their order. Semantics concerns the concepts represented by the words in the sentence and their relations, i.e., the meaning of the sentence. While a human can easily analyse a sentence in a language they understand to figure out its grammatical construction and meaning, this is a difficult task for a computer. To analyse natural language, the computer needs a language model. First and foremost, the computer must have data structures that can represent syntax and semantics. Then, the computer requires some information about what is considered correct syntax and semantics – this can be provided in the form of human-annotated corpora of natural language. Computers use formal languages such as programming languages, and our goal is thus to model natural languages using formal languages. There are several ways to capture the correctness aspect of a natural language corpus in a formal language model. One strategy is to specify a formal language using a set of rules that are, in a sense, very similar to the grammatical rules of natural language. In this thesis, we only consider such rule-based formalisms.

Trees are commonly used to represent syntactic analyses of sentences, and graphs can represent the semantics of sentences. Examples of rule-based formalisms that define languages of trees and graphs are tree automata and graph grammars, respectively. When used in language processing, the rules of a formalism are normally given weights, which are then combined as specified by the formalism to assign weights to the trees or graphs in its language. The weights enable us to rank the trees and graphs by their similarity to the linguistic data in the human-annotated corpora. 

Since natural language is very complicated to model, there are many small gaps in the research of natural language processing to address. The research of this thesis considers two separate but related problems: First, we have the N-best problem, which is about finding a number N of top-ranked hypotheses given a ranked hypothesis space. In our case, the hypothesis space is represented by a weighted rule-based formalism, making the hypothesis space a weighted formal language. The hypotheses themselves can for example have the form of weighted syntax trees. The second problem is that of semantic modelling, whose aim is to find a formalism complex enough to define languages of semantic representations. This model can however not be too complex since we still want to be able to efficiently compute solutions to language processing tasks.

This thesis is divided into two parts according to the two problems introduced above. The first part covers the N-best problem for weighted tree automata. In this line of research, we develop and evaluate multiple versions of an efficient algorithm that solves the problem in question. Since our algorithm is the first to do so, we theoretically and experimentally evaluate it in comparison to the state-of-the-art algorithm for solving an easier version of the problem. In the second part, we study how rule-based formalisms can be used to model graphs that represent meaning, i.e., semantic graphs. We investigate an existing formalism and through this work learn what properties of that formalism are necessary for semantic modelling. Finally, we use our new-found knowledge to develop a more specialised formalism, and argue that it is better suited for the task of semantic modelling than existing formalisms.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, 2021. p. 60
Series
Report / UMINF, ISSN 0348-0542 ; 21.04
Keywords
Weighted tree automata, the N-best problem, efficient algorithms, semantic graph, abstract meaning representation, contextual graph grammars, hyperedge replacement, graph extensions
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-182989 (URN); 978-91-7855-521-5 (ISBN); 978-91-7855-522-2 (ISBN)
Public defence
2021-06-11, MA316, MIT-huset, plan 3, Umeå, 10:00 (English)
Available from: 2021-05-21 Created: 2021-05-11 Last updated: 2021-05-17 Bibliographically approved
Björklund, J., Drewes, F. & Jonsson, A. (2018). A Comparison of Two N-Best Extraction Methods for Weighted Tree Automata. In: Implementation and Application of Automata: 23rd International Conference, CIAA 2018, Charlottetown, PE, Canada, July 30 – August 2, 2018, Proceedings. Paper presented at 23rd International Conference on Implementation and Applications of Automata (CIAA 2018), Charlottetown, Canada, July 30-August 2, 2018 (pp. 97-108). Springer
2018 (English). In: Implementation and Application of Automata: 23rd International Conference, CIAA 2018, Charlottetown, PE, Canada, July 30 – August 2, 2018, Proceedings, Springer, 2018, p. 97-108. Conference paper, Published paper (Refereed)
Abstract [en]

We conduct a comparative study of two state-of-the-art algorithms for extracting the N best trees from a weighted tree automaton (wta). The algorithms are Best Trees, which uses a priority queue to structure the search space, and Filtered Runs, which is based on an algorithm by Huang and Chiang that extracts N best runs, implemented as part of the Tiburon wta toolkit. The experiments are run on four data sets, each consisting of a sequence of wtas of increasing sizes. Our conclusion is that Best Trees can be recommended when the input wtas exhibit a high or unpredictable degree of nondeterminism, whereas Filtered Runs is the better option when the input wtas are large but essentially deterministic.

Place, publisher, year, edition, pages
Springer, 2018
Series
Lecture Notes in Computer Science
Keywords
N-best list, tree automaton
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-149994 (URN); 10.1007/978-3-319-94812-6_9 (DOI); 000469285600009 (); 2-s2.0-85051127322 (Scopus ID); 978-3-319-94812-6 (ISBN); 978-3-319-94811-9 (ISBN)
Conference
23rd International Conference on Implementation and Applications of Automata (CIAA 2018), Charlottetown, Canada, July 30-August 2, 2018
Available from: 2018-06-30 Created: 2018-06-30 Last updated: 2023-03-24 Bibliographically approved
Jonsson, A. (2018). Towards semantic language processing. (Licentiate dissertation). Umeå: Department of Computing Science, Umeå University
2018 (English). Licentiate thesis, comprehensive summary (Other academic)
Alternative title [sv]
Mot semantisk språkbearbetning
Abstract [en]

The overall goal of the field of natural language processing is to facilitate the communication between humans and computers, and to help humans with natural language problems such as translation. In this thesis, we focus on semantic language processing. Modelling semantics – the meaning of natural language – requires both a structure to hold the semantic information and a device that can enforce rules on the structure to ensure well-formed semantics while not being too computationally heavy. The devices used in natural language processing are preferably weighted to allow for comparison of the alternative semantic interpretations outputted by a device.

The structure employed here is the abstract meaning representation (AMR). We show that AMRs representing well-formed semantics can be generated while leaving out AMRs that are not semantically well-formed. For this purpose, we use a type of graph grammar called contextual hyperedge replacement grammar (CHRG). Moreover, we argue that a more well-known subclass of CHRG – the hyperedge replacement grammar (HRG) – is not powerful enough for AMR generation. This is due to the limitation of HRG when it comes to handling co-references, which in turn stems from the fact that HRGs only generate graphs of bounded treewidth.

Furthermore, we also address the N best problem, which is as follows: Given a weighted device, return the N best (here: smallest-weighted, or more intuitively, smallest-errored) structures. Our goal is to solve the N best problem for devices capable of expressing sophisticated forms of semantic representations such as CHRGs. Here, however, we merely take a first step consisting in developing methods for solving the N best problem for weighted tree automata and some types of weighted acyclic hypergraphs.

Place, publisher, year, edition, pages
Umeå: Department of Computing Science, Umeå University, 2018. p. 16
Series
Report / UMINF, ISSN 0348-0542 ; 18.12
Keywords
Weighted tree automata, abstract meaning representation, contextual hyperedge replacement grammar, hyperedge replacement grammar, semantic modelling, the N best problem
National Category
Computer Sciences
Research subject
Computer Science; computational linguistics
Identifiers
urn:nbn:se:umu:diva-153738 (URN); 978-91-7601-964-1 (ISBN)
Presentation
2018-12-07, MC413, Umeå, 10:00 (English)
Available from: 2018-11-29 Created: 2018-11-28 Last updated: 2018-11-29 Bibliographically approved
Drewes, F. & Jonsson, A. (2017). Contextual Hyperedge Replacement Grammars for Abstract Meaning Representations. In: M. Kuhlmann, T. Scheffler (Ed.), Proceedings of the 13th International Workshop on Tree Adjoining Grammars and Related Formalisms. Paper presented at 13th International Workshop on Tree-Adjoining Grammar and Related Formalisms (TAG+13), Umeå, Sweden, September 4-6, 2017 (pp. 102-111). Association for Computational Linguistics
2017 (English). In: Proceedings of the 13th International Workshop on Tree Adjoining Grammars and Related Formalisms / [ed] M. Kuhlmann, T. Scheffler, Association for Computational Linguistics, 2017, p. 102-111. Conference paper, Published paper (Refereed)
Abstract [en]

We show how contextual hyperedge replacement grammars can be used to generate abstract meaning representations (AMRs), and argue that they are more suitable for this purpose than hyperedge replacement grammars. Contextual hyperedge replacement turns out to have two advantages over plain hyperedge replacement: it can completely cover the language of all AMRs over a given domain of concepts, and at the same time its grammars become both smaller and simpler.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2017
Keywords
Abstract Meaning Representation, DAG Language, Contextual Hyperedge-Replacement
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-137921 (URN)
Conference
13th International Workshop on Tree-Adjoining Grammar and Related Formalisms (TAG+13), Umeå, Sweden, September 4-6, 2017
Available from: 2017-07-31 Created: 2017-07-31 Last updated: 2021-05-11 Bibliographically approved
Björklund, J., Drewes, F. & Jonsson, A. (2017). Finding the N Best Vertices in an Infinite Weighted Hypergraph. Theoretical Computer Science, 682, 30-41
2017 (English). In: Theoretical Computer Science, ISSN 0304-3975, E-ISSN 1879-2294, Vol. 682, p. 30-41. Article in journal (Refereed) Published
Abstract [en]

We propose an algorithm for computing the N best vertices in a weighted acyclic hypergraph over a nice semiring. A semiring is nice if it is finitely-generated, idempotent, and has 1 as its minimal element. We then apply the algorithm to the problem of computing the N best trees with respect to a weighted tree automaton, and complement theoretical correctness and complexity arguments with experimental data. The algorithm has several practical applications in natural language processing, for example, to derive the N most likely parse trees with respect to a probabilistic context-free grammar. 

Place, publisher, year, edition, pages
Elsevier, 2017. p. 78
Keywords
Hypergraph, N-best problem, Idempotent semiring
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-132501 (URN); 10.1016/j.tcs.2017.03.010 (DOI); 000405062100005 (); 2-s2.0-85016174936 (Scopus ID)
Note

Special Issue: SI

Available from: 2017-03-15 Created: 2017-03-15 Last updated: 2023-03-24 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-9873-4170
