Evaluating thematic exposures using natural language processing: an unsupervised topic modeling approach to improve classification and mitigate risk
2024 (English) Independent thesis Advanced level (professional degree), 300 HE credits
Student thesis
Abstract [en]
Thematic investments refer to an investment strategy that focuses on specific themes or trends expected to benefit from long-term structural shifts in the economy, society, or the environment. These investments are driven by changes and trends in various areas. Insight into which companies are exposed to specific trends can be used to mitigate risks in portfolio construction while potentially increasing returns. A commonly used classification system is the Global Industry Classification Standard (GICS). While GICS classifies companies in terms of sectors and industries, there might be other, more implicit themes that reach beyond such classification structures. Therefore, it is of interest to investigate whether a better classification of companies based on thematic trends can be found by studying their earnings calls, with emphasis on environmental, social and governance (ESG) themes.
In this master's thesis we examine whether natural language processing and topic modeling, more specifically BERTopic applied to earnings calls, can be used to gain a better understanding of the thematic trends companies are exposed to. These exposures then form the basis for thematic baskets, which are evaluated to determine whether the new grouping of companies contributes to better classification. This is done by using the thematic baskets to create a new "industry" that is added to the original classifications, and by measuring the change in the average distance of returns between the new and the original classification. If the new average distance is greater than the original, a better classification has been identified.
The results show that the new grouping of companies based on their thematic exposure can improve classification, as the average distance between industries has increased and the groups are more internally homogeneous. Therefore, the companies in the baskets are better classified in terms of their price movements since they are exposed to similar thematic trends, albeit with some caveats. This can be used as an overlay on the current classifications and knowledge base, and contributes to possible risk mitigation in portfolio construction and a potential increase in returns.
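The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance measure the thesis uses, so the Euclidean distance between group-mean return series is an assumption made here for illustration only. The function names (`group_mean_returns`, `average_group_distance`, `reclassify`), the label `THEME`, and the toy return figures are all hypothetical and are not taken from the thesis.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean


def group_mean_returns(returns, labels):
    """Average the return series of all companies that share a label."""
    groups = {}
    for company, series in returns.items():
        groups.setdefault(labels[company], []).append(series)
    # zip(*members) walks the member series period by period.
    return {g: [fmean(period) for period in zip(*members)]
            for g, members in groups.items()}


def average_group_distance(returns, labels):
    """Mean Euclidean distance over all pairs of group-mean return series.

    Assumption: the thesis may define "distance of returns" differently.
    Requires at least two groups.
    """
    means = group_mean_returns(returns, labels)
    pairs = combinations(means.values(), 2)
    return fmean(sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
                 for x, y in pairs)


def reclassify(labels, basket, new_label="THEME"):
    """Move the companies in a thematic basket into one new 'industry'."""
    return {c: (new_label if c in basket else lab)
            for c, lab in labels.items()}


# Toy data (invented): two return periods for four hypothetical companies.
returns = {
    "A": [1.0, 0.0],
    "B": [1.0, 4.0],
    "C": [0.0, 1.0],
    "D": [0.0, 5.0],
}
original = {"A": "X", "B": "X", "C": "Y", "D": "Y"}
basket = {"B", "D"}  # hypothetical thematic basket

before = average_group_distance(returns, original)
after = average_group_distance(returns, reclassify(original, basket))
print(f"average distance before: {before:.3f}, after: {after:.3f}")
print("better classification" if after > before else "no improvement")
```

In this toy case the basket pulls together two companies whose returns move alike, so the group means drift further apart once the new "industry" is added. Real return data would need far more periods and groups than shown here.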
Place, publisher, year, edition, pages
2024, p. 55
Keywords [en]
Natural Language Processing, Topic Modeling, Themes, Classification, BERTopic, ESG, Risk mitigation
National Category
Mathematics
Identifiers
URN: urn:nbn:se:umu:diva-228217
OAI: oai:DiVA.org:umu-228217
DiVA, id: diva2:1887051
External cooperation
Skandinaviska Enskilda Banken
Educational program
Master of Science in Engineering and Management
Presentation
2024-05-30, MIT.346, Umeå, 12:15 (English)
Supervisors
Examiners
2024-08-26 2024-08-06 2024-08-26 Bibliographically approved