LLM-based Process Constraints Generation with Context: Automating Conformance Checking and Semantic Anomaly Detection with Instruction Fine-Tuned and Vanilla Large Language Models
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2025 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

Analyzing the data generated by complex information systems to identify undesirable behaviors in event log traces, so-called conformance checking, is a key challenge. With the rise of deep learning and, more specifically, generative AI applications, one promising line of research is the auto-generation of (symbolic) temporal reasoning queries that can then be applied in a semi-automatic manner. Recent work has demonstrated that utilizing fine-tuned open-source large language models (LLMs) for this purpose is promising and, in some aspects, superior to other approaches to automated conformance checking.

This thesis further expands this research direction by integrating additional state-of-the-art LLMs, such as GPT-4o and Llama, and supporting the provision of process-specific natural language context to evaluate their effectiveness. While the previous framework is designed to work directly with event log schemata, this study explores whether incorporating human-readable text descriptions as supplementary input improves performance for non-fine-tuned models. A naïve baseline is introduced to validate that all models outperform random predictions, ensuring the robustness of the evaluation.

The results show that fine-tuning significantly enhances performance, with the xSemAD model achieving consistently higher F1-scores across most constraint types compared to state-of-the-art LLMs. However, text descriptions did not yield the expected performance improvements, highlighting the complexity of aligning contextual information with process semantics. Additionally, for some inherently ambiguous constraints, such as Choice and Alternate Succession, model performance was only marginally better than the naïve baseline. These findings emphasize the importance of task-specific adaptation and the need for advanced methods to address complex constraints.

By demonstrating the potential of fine-tuned LLMs for semantic anomaly detection, this thesis contributes to advancing automated conformance checking and lays the groundwork for future research. Proposed directions include improving textual context generation, exploring alternative ground truth sources, and developing specialized techniques for handling complex constraints.

Place, publisher, year, edition, pages
2025, p. 35
Series
UMNAD ; 1527
National Category
Identifiers
URN: urn:nbn:se:umu:diva-236090; OAI: oai:DiVA.org:umu-236090; DiVA, id: diva2:1942070
External cooperation
SAP Signavio
Educational program
Master of Science Programme in Computing Science and Engineering
Supervisors
Examiners
Available from: 2025-03-04. Created: 2025-03-04. Last updated: 2025-03-04. Bibliographically approved.

Open Access in DiVA

Full text (568 kB), 138 downloads
File information
File name: FULLTEXT01.pdf
File size: 568 kB
Checksum (SHA-512): 9e9466849f802ead85c7e786a59c7385b7a44f6c892fb5e42817990a162898bc85858e8af99c30274053619ae6ac9de3e2959a535e1ae3757d706479526d1997
Type: fulltext
Mimetype: application/pdf
