# Umeå University's logo

umu.sePublikasjoner
Endre søk

##### Publikasjoner (10 av 112) Visa alla publikasjoner
Jäger, G. & Drewes, F. (2024). Optimal strategies for the static black-peg AB game with two and three pegs. Discrete Mathematics, Algorithms and Applications (DMAA), 16(4), Article ID 2350049.
Åpne denne publikasjonen i ny fane eller vindu >>Optimal strategies for the static black-peg AB game with two and three pegs
2024 (engelsk)Inngår i: Discrete Mathematics, Algorithms and Applications (DMAA), ISSN 1793-8309, E-ISSN 1793-8317, Vol. 16, nr 4, artikkel-id 2350049Artikkel i tidsskrift (Fagfellevurdert) Published
##### Abstract [en]

The AB Game is a game similar to the popular game Mastermind. We study a version of this game called Static Black-Peg AB Game. It is played by two players, the codemaker and the codebreaker. The codemaker creates a so-called secret by placing a color from a set of c colors on each of p ≤ c pegs, subject to the condition that every color is used at most once. The codebreaker tries to determine the secret by asking questions, where all questions are given at once and each question is a possible secret. As an answer the codemaker reveals the number of correctly placed colors for each of the questions. After that, the codebreaker only has one more try to determine the secret and thus to win the game.

For given p and c, our goal is to find the smallest number k of questions the codebreaker needs to win, regardless of the secret, and the corresponding list of questions, called a (k + 1)-strategy. We present a (⌈4c/3⌉ − 1)-strategy for p = 2 for all c ≥ 2, and a ⌊(3c − 1)/2⌋-strategy for p = 3 for all c ≥ 4 and show the optimality of both strategies, i.e., we prove that no (k + 1)-strategy for a smaller k exists.

##### sted, utgiver, år, opplag, sider
World Scientific, 2024
##### Emneord
Game theory, mastermind, AB game, optimal strategy
matematik
##### Identifikatorer
urn:nbn:se:umu:diva-210346 (URN)10.1142/s1793830923500490 (DOI)001034748600002 ()2-s2.0-85165934499 (Scopus ID)
##### Forskningsfinansiär
The Kempe Foundations, JCK-2022.1 Tilgjengelig fra: 2023-06-20 Laget: 2023-06-20 Sist oppdatert: 2024-06-26bibliografisk kontrollert
Hatefi, A., Vu, X.-S., Bhuyan, M. H. & Drewes, F. (2023). ADCluster: Adaptive Deep Clustering for unsupervised learning from unlabeled documents. In: Mourad Abbas; Abed Alhakim Freihat (Ed.), Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023): . Paper presented at 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023), Online, December 16-17, 2023. (pp. 68-77). Association for Computational Linguistics
Åpne denne publikasjonen i ny fane eller vindu >>ADCluster: Adaptive Deep Clustering for unsupervised learning from unlabeled documents
2023 (engelsk)Inngår i: Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023) / [ed] Mourad Abbas; Abed Alhakim Freihat, Association for Computational Linguistics, 2023, s. 68-77Konferansepaper, Publicerat paper (Fagfellevurdert)

##### sted, utgiver, år, opplag, sider
Association for Computational Linguistics, 2023
##### Emneord
deep clustering, adaptive, deep learning, unsupervised, data stream
##### Forskningsprogram
datalogi; datorlingvistik
##### Identifikatorer
urn:nbn:se:umu:diva-220260 (URN)
##### Konferanse
6th International Conference on Natural Language and Speech Processing (ICNLSP 2023), Online, December 16-17, 2023.
Tilgjengelig fra: 2024-01-31 Laget: 2024-01-31 Sist oppdatert: 2024-07-02bibliografisk kontrollert
Eklund, A., Forsman, M. & Drewes, F. (2023). An empirical configuration study of a common document clustering pipeline. Northern European Journal of Language Technology (NEJLT), 9(1)
Åpne denne publikasjonen i ny fane eller vindu >>An empirical configuration study of a common document clustering pipeline
2023 (engelsk)Inngår i: Northern European Journal of Language Technology (NEJLT), ISSN 2000-1533, Vol. 9, nr 1Artikkel i tidsskrift (Fagfellevurdert) Published
##### Abstract [en]

Document clustering is frequently used in applications of natural language processing, e.g. to classify news articles or create topic models. In this paper, we study document clustering with the common clustering pipeline that includes vectorization with BERT or Doc2Vec, dimension reduction with PCA or UMAP, and clustering with K-Means or HDBSCAN. We discuss the inter- actions of the different components in the pipeline, parameter settings, and how to determine an appropriate number of dimensions. The results suggest that BERT embeddings combined with UMAP dimension reduction to no less than 15 dimensions provides a good basis for clustering, regardless of the specific clustering algorithm used. Moreover, while UMAP performed better than PCA in our experiments, tuning the UMAP settings showed little impact on the overall performance. Hence, we recommend configuring UMAP so as to optimize its time efficiency. According to our topic model evaluation, the combination of BERT and UMAP, also used in BERTopic, performs best. A topic model based on this pipeline typically benefits from a large number of clusters.

##### sted, utgiver, år, opplag, sider
Linköping University Electronic Press, 2023
##### Emneord
document clustering, topic modeling, dimension reduction, clustering, BERT, doc2vec, UMAP, PCA, K-Means, HDBSCAN
##### Identifikatorer
urn:nbn:se:umu:diva-214455 (URN)10.3384/nejlt.2000-1533.2023.4396 (DOI)
##### Forskningsfinansiär
Swedish Foundation for Strategic Research, ID19-0055 Tilgjengelig fra: 2023-09-15 Laget: 2023-09-15 Sist oppdatert: 2023-09-15bibliografisk kontrollert
Andersson, E., Björklund, J., Drewes, F. & Jonsson, A. (2023). Generating semantic graph corpora with graph expansion grammar. In: Nagy B., Freund R. (Ed.), 13th International Workshop on Non-Classical Models of Automata and Applications (NCMA 2023): . Paper presented at 13th International Workshop on Non-Classical Models of Automata and Applications, NCMA 2023, 18-19 September, 2023, Famagusta, Cyprus (pp. 3-15). Open Publishing Association, 388
Åpne denne publikasjonen i ny fane eller vindu >>Generating semantic graph corpora with graph expansion grammar
2023 (engelsk)Inngår i: 13th International Workshop on Non-Classical Models of Automata and Applications (NCMA 2023) / [ed] Nagy B., Freund R., Open Publishing Association , 2023, Vol. 388, s. 3-15Konferansepaper, Publicerat paper (Fagfellevurdert)
##### Abstract [en]

We introduce LOVELACE, a tool for creating corpora of semantic graphs.The system uses graph expansion grammar as  a representational language, thus allowing users to craft a grammar that describes a corpus with desired properties. When given such grammar as input, the system generates a set of output graphs that are well-formed according to the grammar, i.e., a graph bank.The generation process can be controlled via a number of configurable parameters that allow the user to, for example, specify a range of desired output graph sizes.Central use cases are the creation of synthetic data to augment existing corpora, and as a pedagogical tool for teaching formal language theory.

##### sted, utgiver, år, opplag, sider
Open Publishing Association, 2023
##### Serie
Electronic Proceedings in Theoretical Computer Science, ISSN 2075-2180
##### Emneord
semantic representation, graph corpora, graph grammar
datalogi
##### Identifikatorer
urn:nbn:se:umu:diva-212143 (URN)10.4204/EPTCS.388.3 (DOI)2-s2.0-85173059788 (Scopus ID)
##### Konferanse
13th International Workshop on Non-Classical Models of Automata and Applications, NCMA 2023, 18-19 September, 2023, Famagusta, Cyprus
##### Forskningsfinansiär
Swedish Research Council, 2020-03852 Tilgjengelig fra: 2023-07-18 Laget: 2023-07-18 Sist oppdatert: 2023-10-18bibliografisk kontrollert
Björklund, J., Drewes, F. & Jonsson, A. (2023). Generation and polynomial parsing of graph languages with non-structural reentrancies. Computational linguistics - Association for Computational Linguistics (Print), 49(4), 841-882
Åpne denne publikasjonen i ny fane eller vindu >>Generation and polynomial parsing of graph languages with non-structural reentrancies
2023 (engelsk)Inngår i: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 49, nr 4, s. 841-882Artikkel i tidsskrift (Fagfellevurdert) Published
##### Abstract [en]

Graph-based semantic representations are popular in natural language processing (NLP), where it is often convenient to model linguistic concepts as nodes and relations as edges between them. Several attempts have been made to find a generative device that is sufficiently powerful to describe languages of semantic graphs, while at the same allowing efficient parsing. We contribute to this line of work by introducing graph extension grammar, a variant of the contextual hyperedge replacement grammars proposed by Hoffmann et al. Contextual hyperedge replacement can generate graphs with non-structural reentrancies, a type of node-sharing that is very common in formalisms such as abstract meaning representation, but which context-free types of graph grammars cannot model. To provide our formalism with a way to place reentrancies in a linguistically meaningful way, we endow rules with logical formulas in counting monadic second-order logic. We then present a parsing algorithm and show as our main result that this algorithm runs in polynomial time on graph languages generated by a subclass of our grammars, the so-called local graph extension grammars.

##### sted, utgiver, år, opplag, sider
Association for Computational Linguistics, 2023
##### Emneord
Graph grammar, semantic graph, meaning representation, graph parsing
##### Forskningsprogram
datalogi; datorlingvistik
##### Identifikatorer
urn:nbn:se:umu:diva-209515 (URN)10.1162/coli_a_00488 (DOI)001152974700005 ()2-s2.0-85173016925 (Scopus ID)
##### Prosjekter
STING – Synthesis and analysis with Transducers and Invertible Neural Generators
##### Forskningsfinansiär
Wallenberg AI, Autonomous Systems and Software Program (WASP)Swedish Research Council, 2020-03852 Tilgjengelig fra: 2023-06-10 Laget: 2023-06-10 Sist oppdatert: 2024-02-19bibliografisk kontrollert
Drewes, F., Mörbitz, R. & Vogler, H. (2023). Hybrid tree automata and the yield theorem for constituent tree automata. Theoretical Computer Science, 979, Article ID 114185.
Åpne denne publikasjonen i ny fane eller vindu >>Hybrid tree automata and the yield theorem for constituent tree automata
2023 (engelsk)Inngår i: Theoretical Computer Science, ISSN 0304-3975, E-ISSN 1879-2294, Vol. 979, artikkel-id 114185Artikkel i tidsskrift (Fagfellevurdert) Published
##### Abstract [en]

We introduce an automaton model for recognizing sets of hybrid trees, the hybrid tree automaton (HTA). Special cases of hybrid trees are constituent trees and dependency trees, as they occur in natural language processing. This includes the cases of discontinuous constituent trees and non-projective dependency trees. In general, a hybrid tree is a tree over a ranked alphabet in which a symbol can additionally be equipped with a natural number, called index; in a hybrid tree, each index occurs at most once. The yield of a hybrid tree is a sequence of strings over those symbols which occur in an indexed form; the corresponding indices determine the order within these strings; the borders between two consecutive strings are determined by the gaps in the sequence of indices. As a special case of HTA, we define constituent tree automata (CTA) which recognize sets of constituent trees. We introduce the notion of CTA-inductively recognizable and we show that the set of yields of a CTA-inductively recognizable set of constituent trees is an LCFRS language, and vice versa.

Elsevier, 2023
##### Emneord
Constituent tree, Constituent tree automata, Hybrid tree, Hybrid tree automata, Linear context-free rewriting system
##### Identifikatorer
urn:nbn:se:umu:diva-215370 (URN)10.1016/j.tcs.2023.114185 (DOI)2-s2.0-85173478895 (Scopus ID)
Tilgjengelig fra: 2023-10-31 Laget: 2023-10-31 Sist oppdatert: 2023-11-01bibliografisk kontrollert
Drewes, F. & Volkov, M. (2023). Preface. In: Frank Drewes; Mikhail Volkov (Ed.), Developments in language theory: 27th International Conference, DLT 2023 Umeå, Sweden, June 12–16, 2023 Proceedings. Springer Science+Business Media B.V.
Åpne denne publikasjonen i ny fane eller vindu >>Preface
2023 (engelsk)Inngår i: Developments in language theory: 27th International Conference, DLT 2023 Umeå, Sweden, June 12–16, 2023 Proceedings / [ed] Frank Drewes; Mikhail Volkov, Springer Science+Business Media B.V., 2023Kapittel i bok, del av antologi (Annet vitenskapelig)
##### sted, utgiver, år, opplag, sider
Springer Science+Business Media B.V., 2023
##### Serie
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13911
##### Identifikatorer
urn:nbn:se:umu:diva-210220 (URN)2-s2.0-85161250270 (Scopus ID)978-3-031-33263-0 (ISBN)978-3-031-33264-7 (ISBN)
Tilgjengelig fra: 2023-06-28 Laget: 2023-06-28 Sist oppdatert: 2023-06-28bibliografisk kontrollert
Drewes, F., Hoffmann, B. & Minas, M. (2022). Acyclic contextual hyperedge replacement: decidability of acyclicity and generative power. In: Nicolas Behr; Daniel Strüber (Ed.), Graph Transformation: 15th International Conference, ICGT 2022, Held as Part of STAF 2022, Nantes, France, July 7–8, 2022, Proceedings. Paper presented at 15th International Conference on Graph Transformation, ICGT 2022, held as Part of STAF 2022, Nantes, France, July 7-8, 2022. (pp. 3-19). Cham: Springer
Åpne denne publikasjonen i ny fane eller vindu >>Acyclic contextual hyperedge replacement: decidability of acyclicity and generative power
2022 (engelsk)Inngår i: Graph Transformation: 15th International Conference, ICGT 2022, Held as Part of STAF 2022, Nantes, France, July 7–8, 2022, Proceedings / [ed] Nicolas Behr; Daniel Strüber, Cham: Springer, 2022, s. 3-19Konferansepaper, Publicerat paper (Fagfellevurdert)
##### Abstract [en]

Graph grammars based on contextual hyperedge replacement (CHR)  extend the generative power of the well-known hyperedge replacement  (HR) grammars to an extent that makes them useful for practical  modeling. Recent work has shown that acyclicity is a key condition for  parsing CHR grammars efficiently. In this paper we show that acyclicity of CHR grammars is decidable  and that the generative power of  acyclic CHR grammars lies strictly between that of HR grammars and  unrestricted CHR grammars.

##### sted, utgiver, år, opplag, sider
Cham: Springer, 2022
##### Serie
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13349
##### Emneord
Graph grammar, Hyperedge replacement, Contextual hyperedge replacement, Acyclicity, Decidability, Generative power
datalogi
##### Identifikatorer
urn:nbn:se:umu:diva-194089 (URN)10.1007/978-3-031-09843-7_1 (DOI)000870300100001 ()2-s2.0-85135030138 (Scopus ID)978-3-031-09842-0 (ISBN)978-3-031-09843-7 (ISBN)
##### Konferanse
15th International Conference on Graph Transformation, ICGT 2022, held as Part of STAF 2022, Nantes, France, July 7-8, 2022.
Tilgjengelig fra: 2022-04-24 Laget: 2022-04-24 Sist oppdatert: 2023-09-05bibliografisk kontrollert
Eklund, A., Forsman, M. & Drewes, F. (2022). Dynamic topic modeling by clustering embeddings from pretrained language models: a research proposal. In: Yan Hanqi; Yang Zonghan; Sebastian Ruder; Wan Xiaojun (Ed.), Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop: . Paper presented at The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Online, November 21-24, 2022 (pp. 84-91). Association for Computational Linguistics
Åpne denne publikasjonen i ny fane eller vindu >>Dynamic topic modeling by clustering embeddings from pretrained language models: a research proposal
2022 (engelsk)Inngår i: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop / [ed] Yan Hanqi; Yang Zonghan; Sebastian Ruder; Wan Xiaojun, Association for Computational Linguistics , 2022, s. 84-91Konferansepaper, Publicerat paper (Fagfellevurdert)
##### Abstract [en]

A new trend in topic modeling research is to do Neural Topic Modeling by Clustering document Embeddings (NTM-CE) created with a pretrained language model. Studies have evaluated static NTM-CE models and found them performing comparably to, or even better than other topic models. An important extension of static topic modeling is making the models dynamic, allowing the study of topic evolution over time, as well as detecting emerging and disappearing topics. In this research proposal, we present two research questions to understand dynamic topic modeling with NTM-CE theoretically and practically. To answer these, we propose four phases with the aim of establishing evaluation methods for dynamic topic modeling, finding NTM-CE-specific properties, and creating a framework for dynamic NTM-CE. For evaluation, we propose to use both quantitative measurements of coherence and human evaluation supported by our recently developed tool.

##### sted, utgiver, år, opplag, sider
Association for Computational Linguistics, 2022
##### Emneord
topic modeling, dynamic topic modeling, topic modeling evaluation, research proposal, pretrained language model
##### Identifikatorer
urn:nbn:se:umu:diva-202486 (URN)
##### Konferanse
The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Online, November 21-24, 2022
##### Forskningsfinansiär
Swedish Foundation for Strategic Research, ID190055 Tilgjengelig fra: 2023-01-11 Laget: 2023-01-11 Sist oppdatert: 2023-01-11bibliografisk kontrollert
Drewes, F., Mörbitz, R. & Vogler, H. (2022). Hybrid Tree Automata and the Yield Theorem for Constituent Tree Automata. In: Caron P.; Mignot L. (Ed.), Implementation and Application of Automata (CIAA 2022): 26th International Conference, CIAA 2022, Rouen, France, June 28 – July 1, 2022, Proceedings. Paper presented at CIAA 2022: 26th International Conference on Implementation and Application of Automata, Rouen, France, June 28 - July 1, 2022 (pp. 93-105). Springer Science+Business Media B.V.
Åpne denne publikasjonen i ny fane eller vindu >>Hybrid Tree Automata and the Yield Theorem for Constituent Tree Automata
2022 (engelsk)Inngår i: Implementation and Application of Automata (CIAA 2022): 26th International Conference, CIAA 2022, Rouen, France, June 28 – July 1, 2022, Proceedings / [ed] Caron P.; Mignot L., Springer Science+Business Media B.V., 2022, s. 93-105Konferansepaper, Publicerat paper (Fagfellevurdert)
##### Abstract [en]

We introduce an automaton model for recognizing sets of hybrid trees, the hybrid tree automaton (HTA).    Special cases of hybrid trees are constituent trees and dependency trees, as they occur in natural language processing.    This includes the cases of discontinuous  constituent trees and non-projective dependency trees.   In general, a hybrid tree is a tree over a ranked alphabet %of symbols and indexed symbols, an indexed symbol being a symbol paired with   in which symbols can additionally be equipped with an index,   i.e.,   a natural number which indicates the position of that symbol in the yield of the hybrid tree.    As a special case of HTA, we define constituent tree automata (CTA) which recognize sets of  constituent trees. We show that the set of yields of a CTA-recognizable set of constituent trees is an LCFRS language, and vice versa.

##### sted, utgiver, år, opplag, sider
Springer Science+Business Media B.V., 2022
##### Serie
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13266
##### Emneord
tree language, constituent tree, dependency tree, hybrid tree
datalogi
##### Identifikatorer
urn:nbn:se:umu:diva-194090 (URN)10.1007/978-3-031-07469-1_7 (DOI)000876366600007 ()2-s2.0-85131956678 (Scopus ID)978-3-031-07468-4 (ISBN)978-3-031-07469-1 (ISBN)
##### Konferanse
CIAA 2022: 26th International Conference on Implementation and Application of Automata, Rouen, France, June 28 - July 1, 2022
Tilgjengelig fra: 2022-04-24 Laget: 2022-04-24 Sist oppdatert: 2022-12-30bibliografisk kontrollert
##### Prosjekter
Trädautomater för datoriserad språkteknologi [2008-06074_VR]; Umeå universitet
##### Identifikatorer
ORCID-id: orcid.org/0000-0001-7349-7693

#### Søk i DiVA

Visa alla publikasjoner