A Comparative Study of MATCH_RECOGNIZE and REGEXP-Based SQL Approaches for Process Querying
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2025 (English). Independent thesis, basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis (degree project)
Abstract [en]

Businesses and organizations rely on process querying to analyse data, using tools such as process querying languages (PQLs) or SQL. Because PQLs introduce certain drawbacks and SQL remains a standard for handling data, assessing SQL's usefulness for process querying is of practical relevance. The most common ways to express such queries in SQL are MATCH_RECOGNIZE and regular-expression-based (REGEXP) SQL approaches. However, the usefulness of MATCH_RECOGNIZE is still unknown, given that SQL already has well-established REGEXP support and many SQL engines do not yet support MATCH_RECOGNIZE.
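
To make the REGEXP-based idea concrete, the following is a minimal sketch and not the thesis's actual queries. It assumes a hypothetical `event_log` table, hypothetical activity names, and SQLite as the engine (accessed through Python's `sqlite3`, with a user-defined `regexp` function). Each case's events are concatenated into an ordered trace string, and a regular expression then selects the cases where `create` is eventually followed by `approve` directly followed by `pay`. SQLite has no MATCH_RECOGNIZE, so only the REGEXP side is shown.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite rewrites "X REGEXP Y" as regexp(Y, X): pattern first, value second.
    return value is not None and re.search(pattern, value) is not None


def matching_cases(conn, pattern):
    # Build one ordered, comma-separated trace per case, then filter by regex.
    query = """
        WITH traces AS (
            SELECT DISTINCT
                case_id,
                group_concat(activity, ',') OVER (
                    PARTITION BY case_id ORDER BY ts
                    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                ) AS trace
            FROM event_log
        )
        SELECT case_id FROM traces
        WHERE trace REGEXP ?
        ORDER BY case_id
    """
    return [row[0] for row in conn.execute(query, (pattern,))]


conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)
conn.execute("CREATE TABLE event_log (case_id INTEGER, ts INTEGER, activity TEXT)")

# Hypothetical events, inserted out of order on purpose: the query sorts by ts.
conn.executemany(
    "INSERT INTO event_log VALUES (?, ?, ?)",
    [
        (1, 4, "pay"), (1, 1, "create"), (1, 3, "approve"), (1, 2, "review"),
        (2, 1, "create"), (2, 2, "approve"), (2, 3, "reject"), (2, 4, "pay"),
        (3, 1, "approve"), (3, 2, "pay"), (3, 3, "create"),
    ],
)

# "create", then any number of activities, then "approve" directly followed by "pay".
PATTERN = r"(^|,)create(,[^,]+)*,approve,pay(,|$)"
matches = matching_cases(conn, PATTERN)
print(matches)
```

In an engine that supports MATCH_RECOGNIZE, the same intent would instead be written as a row pattern over the ordered events of each case, without first flattening them into a string.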

In this thesis we empirically evaluate the performance and scalability of MATCH_RECOGNIZE in comparison to REGEXP-based SQL approaches for process querying, using a simple dataset derived from SIGNAL (a PQL by SAP Signavio) and translating the SIGNAL patterns to both MATCH_RECOGNIZE and REGEXP-based SQL queries. The execution time, CPU usage, and peak memory usage are measured for each query. To evaluate scalability, the dataset size is varied using logarithmic scaling (e.g., 10%, 25%, 50%, 75%, 100%) for each query.
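
The measurement procedure described above could be sketched roughly as follows. This is an illustrative harness only, not the thesis's benchmark code: the workload `count_approvals`, the synthetic dataset, and the helper names are all hypothetical, and `tracemalloc` only sees memory allocated by Python itself, not by an external SQL engine.

```python
import time
import tracemalloc

# Dataset fractions taken from the abstract: 10%, 25%, 50%, 75%, 100%.
FRACTIONS = (0.10, 0.25, 0.50, 0.75, 1.00)


def measure(query_fn, rows):
    # Record wall-clock time, CPU time, and peak Python memory for one run.
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    query_fn(rows)
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"rows": len(rows), "wall_s": wall_s, "cpu_s": cpu_s, "peak_bytes": peak_bytes}


def run_scaling(query_fn, dataset, fractions=FRACTIONS):
    # Run the same query on growing prefixes of the dataset.
    results = []
    for fraction in fractions:
        subset = dataset[: round(len(dataset) * fraction)]
        results.append(measure(query_fn, subset))
    return results


def count_approvals(rows):
    # Hypothetical stand-in for a real process query.
    return sum(1 for _, activity in rows if activity == "approve")


# Hypothetical synthetic event list: (event_id, activity).
dataset = [(i, "approve" if i % 3 == 0 else "create") for i in range(1000)]
results = run_scaling(count_approvals, dataset)
for row in results:
    print(row)
```

In a real comparison, `query_fn` would submit the MATCH_RECOGNIZE or REGEXP-based query to the database, and each configuration would typically be repeated several times to reduce noise.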

The experiment showed that REGEXP-based SQL approaches outperform MATCH_RECOGNIZE in all metrics, often by a factor of 2. The results also showed that both approaches scale linearly with increasing data size.

These findings indicate that MATCH_RECOGNIZE might not be the best tool for all process querying tasks in SQL, especially when using a simple dataset. However, we strongly suspect that MATCH_RECOGNIZE outperforms REGEXP-based SQL approaches as the complexity of the dataset increases.

Place, publisher, year, edition, pages
2025. p. 25
Series
UMNAD ; 1565
Keywords [en]
sql, process querying, MATCH_RECOGNIZE, temporal pattern matching, process querying languages, regular expressions
National subject category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-240784
OAI: oai:DiVA.org:umu-240784
DiVA, id: diva2:1973636
Educational program
Bachelor of Science Programme in Computing Science
Supervisors
Examiners
Available from: 2025-06-23 Created: 2025-06-19 Last updated: 2025-06-23 Bibliographically approved

Open Access in DiVA

fulltext (2548 kB), 84 downloads
File information
File name: FULLTEXT01.pdf
File size: 2548 kB
Checksum (SHA-512):
36c1736ffbadb484e62833b2d9d51cc2eb0b87e2aaeee5b9f956e0105bd5aa0ddee649f63ad2e263bd041738c7204bbd2af372a1789b1585dd7dd5e73ac4a481
Type: fulltext
Mimetype: application/pdf

Search further in DiVA

By author/editor
Hylander, Daniel
By organisation
Department of Computing Science
Computer Sciences
