Dynamic topic modeling by clustering embeddings from pretrained language models: a research proposal
Umeå University, Faculty of Science and Technology, Department of Computing Science. Adlede AB, Umeå, Sweden. (Foundations of Language Processing) ORCID iD: 0000-0002-4366-7863
Adlede AB, Umeå, Sweden. ORCID iD: 0000-0001-6601-5190
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Foundations of Language Processing) ORCID iD: 0000-0001-7349-7693
2022 (English). In: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop / [ed] Yan Hanqi; Yang Zonghan; Sebastian Ruder; Wan Xiaojun, Association for Computational Linguistics, 2022, p. 84-91. Conference paper, Published paper (Refereed)
Abstract [en]

A new trend in topic modeling research is Neural Topic Modeling by Clustering document Embeddings (NTM-CE) created with a pretrained language model. Studies have evaluated static NTM-CE models and found that they perform comparably to, or even better than, other topic models. An important extension of static topic modeling is making the models dynamic, which allows studying how topics evolve over time and detecting emerging and disappearing topics. In this research proposal, we present two research questions aimed at understanding dynamic topic modeling with NTM-CE theoretically and practically. To answer these, we propose four phases with the aim of establishing evaluation methods for dynamic topic modeling, finding NTM-CE-specific properties, and creating a framework for dynamic NTM-CE. For evaluation, we propose to use both quantitative measurements of coherence and human evaluation supported by our recently developed tool.
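
The general idea behind NTM-CE with a time dimension can be illustrated with a minimal sketch. It is not the authors' method or framework: the toy corpus, the hand-made 2-D vectors standing in for pretrained language model embeddings, the choice of k-means, the frequency-based topic words, and the per-year counting are all assumptions made for illustration, and the names `DOCS`, `kmeans`, `top_words`, and `topic_sizes_over_time` are hypothetical.

```python
from collections import Counter, defaultdict
import math

# Toy corpus: (year, text, embedding). The 2-D vectors are hand-made stand-ins
# (an assumption) for embeddings a pretrained language model would produce.
DOCS = [
    (2020, "virus vaccine trial", (0.9, 0.1)),
    (2020, "vaccine dose virus", (0.8, 0.2)),
    (2020, "election vote poll", (0.1, 0.9)),
    (2021, "vaccine booster virus", (0.85, 0.15)),
    (2021, "election vote result", (0.2, 0.8)),
    (2021, "vote poll election", (0.15, 0.95)),
]


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def top_words(texts, labels, n=2):
    """Most frequent words per cluster: a crude stand-in for a proper term weighting."""
    counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        counts[label].update(text.split())
    return {label: [w for w, _ in c.most_common(n)] for label, c in counts.items()}


def topic_sizes_over_time(years, labels):
    """Number of documents per topic in each time slice (here: per year)."""
    sizes = defaultdict(Counter)
    for year, label in zip(years, labels):
        sizes[year][label] += 1
    return {year: dict(c) for year, c in sorted(sizes.items())}


years = [y for y, _, _ in DOCS]
texts = [t for _, t, _ in DOCS]
embeddings = [e for _, _, e in DOCS]

labels = kmeans(embeddings, k=2)                # static step: cluster the embeddings
words = top_words(texts, labels)                # describe each cluster by its words
sizes = topic_sizes_over_time(years, labels)    # dynamic view: topic size per year

print(labels)
print(words)
print(sizes)
```

Counting cluster membership per time slice is only one simple way to obtain a dynamic view. It shows topics growing or shrinking, but it does not capture changes in a topic's vocabulary, nor does it detect topics that emerge or disappear.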

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2022. p. 84-91
Keywords [en]
topic modeling, dynamic topic modeling, topic modeling evaluation, research proposal, pretrained language model
National subject category
Identifiers
URN: urn:nbn:se:umu:diva-202486
OAI: oai:DiVA.org:umu-202486
DiVA, id: diva2:1725510
Conference
The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Online, November 21-24, 2022
Funder
Swedish Foundation for Strategic Research, ID190055
Available from: 2023-01-11. Created: 2023-01-11. Last updated: 2023-01-11. Bibliographically approved


Authors
Eklund, Anton; Forsman, Mona; Drewes, Frank