Dynamic topic modeling by clustering embeddings from pretrained language models: a research proposal
Eklund, Anton. Umeå University, Faculty of Science and Technology, Department of Computing Science. Adlede AB, Umeå, Sweden. (Foundations of Language Processing) ORCID iD: 0000-0002-4366-7863
Forsman, Mona. Adlede AB, Umeå, Sweden. ORCID iD: 0000-0001-6601-5190
Drewes, Frank. Umeå University, Faculty of Science and Technology, Department of Computing Science. (Foundations of Language Processing) ORCID iD: 0000-0001-7349-7693
2022 (English). In: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop / [ed] Yan Hanqi; Yang Zonghan; Sebastian Ruder; Wan Xiaojun, Association for Computational Linguistics, 2022, pp. 84-91. Conference paper, Published paper (Refereed)
Abstract [en]

A new trend in topic modeling research is to do Neural Topic Modeling by Clustering document Embeddings (NTM-CE) created with a pretrained language model. Studies have evaluated static NTM-CE models and found that they perform comparably to, or even better than, other topic models. An important extension of static topic modeling is making the models dynamic, allowing the study of topic evolution over time, as well as detecting emerging and disappearing topics. In this research proposal, we present two research questions to understand dynamic topic modeling with NTM-CE theoretically and practically. To answer these, we propose four phases with the aim of establishing evaluation methods for dynamic topic modeling, finding NTM-CE-specific properties, and creating a framework for dynamic NTM-CE. For evaluation, we propose to use both quantitative measurements of coherence and human evaluation supported by our recently developed tool.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2022. pp. 84-91
Keywords [en]
topic modeling, dynamic topic modeling, topic modeling evaluation, research proposal, pretrained language model
National subject category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:umu:diva-202486
OAI: oai:DiVA.org:umu-202486
DiVA, id: diva2:1725510
Conference
The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Online, November 21-24, 2022
Research funder
Swedish Foundation for Strategic Research (SSF), ID190055
Available from: 2023-01-11 Created: 2023-01-11 Last updated: 2023-01-11 Bibliographically approved

Open Access in DiVA

Full text not available in DiVA


Search further in DiVA

By author/editor
Eklund, Anton; Forsman, Mona; Drewes, Frank
By organisation
Department of Computing Science
Language Technology (Computational Linguistics)
