Dynamic topic modeling by clustering embeddings from pretrained language models: a research proposal
Eklund, Anton. Umeå University, Faculty of Science and Technology, Department of Computing Science. Adlede AB, Umeå, Sweden. (Foundations of Language Processing) ORCID iD: 0000-0002-4366-7863
Forsman, Mona. Adlede AB, Umeå, Sweden. ORCID iD: 0000-0001-6601-5190
Drewes, Frank. Umeå University, Faculty of Science and Technology, Department of Computing Science. (Foundations of Language Processing) ORCID iD: 0000-0001-7349-7693
2022 (English). In: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop / [ed] Yan Hanqi; Yang Zonghan; Sebastian Ruder; Wan Xiaojun, Association for Computational Linguistics, 2022, p. 84-91.
Conference paper, Published paper (Refereed)
Abstract [en]

A new trend in topic modeling research is Neural Topic Modeling by Clustering document Embeddings (NTM-CE), where the embeddings are created with a pretrained language model. Studies have evaluated static NTM-CE models and found that they perform comparably to, or even better than, other topic models. An important extension of static topic modeling is making the models dynamic, which allows the study of topic evolution over time and the detection of emerging and disappearing topics. In this research proposal, we present two research questions aimed at understanding dynamic topic modeling with NTM-CE both theoretically and practically. To answer these, we propose four phases with the aim of establishing evaluation methods for dynamic topic modeling, finding NTM-CE-specific properties, and creating a framework for dynamic NTM-CE. For evaluation, we propose to use both quantitative measurements of coherence and human evaluation supported by our recently developed tool.
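
As a rough illustration of the idea named in the abstract (clustering document embeddings, then following the clusters over time), the following minimal sketch may help. It is not the authors' framework: the 2-D vectors are hypothetical stand-ins for embeddings from a pretrained language model, the plain k-means routine is just one possible clustering choice, and the rules used here for "emerging" and "disappearing" topics are simplifying assumptions.

```python
from collections import Counter
from math import dist

# Hypothetical toy data: (embedding, time slice). Real NTM-CE would use
# high-dimensional embeddings produced by a pretrained language model.
DOCS = [
    ((0.0, 0.1), 2020), ((0.1, 0.0), 2020), ((5.0, 5.1), 2020),
    ((0.1, 0.1), 2021), ((5.1, 5.0), 2021), ((9.0, 0.0), 2021),
    ((0.0, 0.0), 2022), ((9.1, 0.1), 2022), ((9.0, 0.2), 2022),
]


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points]


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iterations):
        labels = assign(points, centroids)
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return assign(points, centroids)


def topic_timeline(labels, slices, k):
    """Map each topic (cluster id) to its document count per time slice."""
    ordered = sorted(set(slices))
    counts = Counter(zip(labels, slices))
    return {t: [counts[(t, s)] for s in ordered] for t in range(k)}


K = 3  # assumed number of topics for this toy example
points = [vec for vec, _ in DOCS]
slices = [year for _, year in DOCS]
labels = kmeans(points, K)
timeline = topic_timeline(labels, slices, K)

# Simplified rules (assumptions): a topic "emerges" if it is absent in the
# first slice but appears later, and "disappears" if it is present earlier
# but absent in the last slice.
emerging = sorted(t for t, c in timeline.items() if c[0] == 0 and any(c[1:]))
disappearing = sorted(t for t, c in timeline.items() if c[-1] == 0 and any(c[:-1]))

print(timeline, emerging, disappearing)
```

Clustering all documents once and then counting cluster members per time slice is only one simple way to obtain a dynamic view; clustering each slice separately and aligning clusters across slices would be another.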

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2022. p. 84-91
Keywords [en]
topic modeling, dynamic topic modeling, topic modeling evaluation, research proposal, pretrained language model
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:umu:diva-202486
OAI: oai:DiVA.org:umu-202486
DiVA, id: diva2:1725510
Conference
The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Online, November 21-24, 2022
Funder
Swedish Foundation for Strategic Research, ID190055
Available from: 2023-01-11 Created: 2023-01-11 Last updated: 2023-01-11 Bibliographically approved

Authority records

Eklund, Anton; Drewes, Frank

Search in DiVA

By author/editor
Eklund, Anton; Forsman, Mona; Drewes, Frank