Evaluating thematic exposures using natural language processing: an unsupervised topic modeling approach to improve classification and mitigate risk
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
2024 (English). Independent thesis, Advanced level (professional degree), 300 HE credits. Student thesis
Abstract [en]

Thematic investing is an investment strategy that focuses on specific themes or trends expected to benefit from long-term structural shifts in the economy, society, or the environment. Insight into which companies are exposed to specific trends can be used to mitigate risks in portfolio construction while potentially increasing returns. A commonly used classification system is the Global Industry Classification Standard (GICS). While GICS classifies companies in terms of sectors and industries, there might be other, more implicit themes that reach beyond such classification structures. Therefore, it is of interest to investigate whether a better classification of companies based on thematic trends can be found by studying their earnings calls, with emphasis on environmental, social and governance (ESG) themes.

In this master's thesis, we examine whether natural language processing and topic modeling, more specifically BERTopic applied to earnings calls, can give a better understanding of the thematic trends companies are exposed to. These exposures form the basis for thematic baskets, which are evaluated to determine whether the new grouping of companies contributes to a better classification. This is done by using the thematic baskets to create a new "industry" that is added to the original classification, and then measuring the change in the average distance of returns between the new and original classifications. If the new average distance is greater than the original, a better classification has been identified.
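The evaluation step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the distance measure, so this example assumes a correlation distance (1 minus the Pearson correlation) between equal-weighted group return series. The return data, company names, group labels, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def group_mean_returns(returns, labels):
    """Equal-weighted average return series for each group label."""
    groups = {}
    for company, label in labels.items():
        groups.setdefault(label, []).append(returns[company])
    return {
        label: [sum(day) / len(day) for day in zip(*series)]
        for label, series in groups.items()
    }


def average_between_group_distance(returns, labels):
    """Mean of (1 - correlation) over all pairs of groups (assumed metric)."""
    means = group_mean_returns(returns, labels)
    pairs = list(combinations(sorted(means), 2))
    return sum(1 - pearson(means[a], means[b]) for a, b in pairs) / len(pairs)


# Made-up daily returns for four hypothetical companies.
returns = {
    "A": [0.01, -0.02, 0.03, 0.00, 0.01],
    "B": [0.02, -0.01, 0.02, 0.01, 0.00],
    "C": [-0.01, 0.02, -0.03, 0.01, -0.02],
    "D": [-0.02, 0.01, -0.02, 0.00, -0.01],
}

# Hypothetical original classification and a hypothetical thematic basket.
original = {"A": "Energy", "B": "Utilities", "C": "Energy", "D": "Utilities"}
basket = {"C", "D"}

# Overlay: basket members move into a new "Thematic" industry.
overlay = {c: ("Thematic" if c in basket else g) for c, g in original.items()}

d_original = average_between_group_distance(returns, original)
d_overlay = average_between_group_distance(returns, overlay)
print(f"original: {d_original:.3f}  overlay: {d_overlay:.3f}")
```

In this toy data, companies C and D move together and against A and B, so placing them in their own group separates the group return series and raises the average distance. Under the criterion stated above, that would count as a better classification.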

The results show that the new grouping of companies based on their thematic exposure can improve classification, as the average distance between industries has increased and the groups are more internally homogeneous. Therefore, the companies in the baskets are better classified in terms of their price movements since they are exposed to similar thematic trends, albeit with some caveats. This can be used as an overlay on the current classifications and knowledge base, and contributes to possible risk mitigation in portfolio construction and a potential increase in returns. 

Place, publisher, year, edition, pages
2024, p. 55
Keywords [en]
Natural Language Processing, Topic Modeling, Themes, Classification, BERTopic, ESG, Risk mitigation
National Category
Mathematics
Identifiers
URN: urn:nbn:se:umu:diva-228217
OAI: oai:DiVA.org:umu-228217
DiVA, id: diva2:1887051
External cooperation
Skandinaviska Enskilda Banken
Educational program
Master of Science in Engineering and Management
Presentation
2024-05-30, MIT.346, Umeå, 12:15 (English)
Supervisors
Examiners
Available from: 2024-08-26
Created: 2024-08-06
Last updated: 2024-08-26
Bibliographically approved

Open Access in DiVA

Järvenstrand_Åström (2322 kB), 293 downloads
File information
File name: FULLTEXT01.pdf
File size: 2322 kB
Checksum SHA-512: 014597f7e39c2ee0a14fd284b1adb844226008ac67a319d9a582c58dc792c8f41252950cf5a2a6bdf62b76199b60c3f8970072eab7ff247625178b74b6d24e0f
Type: fulltext
Mimetype: application/pdf
