Machine Learning Algorithms for Proactive Ransomware Threat Hunting
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
2024 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

This thesis investigated the performance of several classifier and ensemble models for ransomware detection. The objective was to make recommendations for future proactive threat hunting on Windows OS using machine learning models and dynamic file analysis. The models were trained on file behaviours observed during dynamic analysis in a sandbox. Their performance was then evaluated through a statistical analysis of classification outcomes, and the threat hunting recommendations were based on the results of that evaluation.

All ensemble models evaluated in this study used clustering as a step before classification, with the aim of investigating how these ensembles compared to pure classifiers. A review of existing literature found that previous studies focused on either clustering or classification. Investigating the combination of the two was therefore deemed to hold scientific value. To that end, two pure classification models and four ensemble models were implemented and evaluated, with the ensembles combining the same classification algorithms with two clustering algorithms. The classifiers, gradient boosting and decision trees, were chosen for their high performance in previous research on machine learning for ransomware detection. The clustering algorithms chosen for the ensemble models were agglomerative clustering and k-means clustering.
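The abstract does not say how the clustering output was passed to the classifiers, so the following is only a minimal sketch of one common way to cluster before classifying: fit k-means on the feature vectors, append a one-hot encoding of each sample's cluster to its features, then train a classifier on the extended rows. Everything concrete here is an assumption made for illustration: the two behaviour features, the toy samples, and the depth-1 decision tree (a stump) that stands in for the thesis's decision tree and gradient boosting models.

```python
import random
from collections import Counter

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns a list of k centroids."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # New centroid = mean of its group; keep the old one if the group is empty.
        centroids = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def add_cluster_features(point, centroids):
    """Append a one-hot encoding of the nearest cluster to the raw features."""
    cid = nearest(point, centroids)
    return tuple(point) + tuple(1.0 if i == cid else 0.0 for i in range(len(centroids)))

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels):
    """Depth-1 decision tree: the (feature, threshold) split with fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= t]
            right = [lab for row, lab in zip(rows, labels) if row[f] > t]
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def predict(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

# Hypothetical sandbox features: (files modified per minute, mean entropy increase).
X = [(1.0, 0.10), (2.0, 0.20), (1.5, 0.15), (2.5, 0.10),
     (40.0, 0.90), (55.0, 0.80), (48.0, 0.95), (60.0, 0.85)]
y = ["benign"] * 4 + ["ransomware"] * 4

centroids = kmeans(X, k=2)                                         # step 1: cluster
stump = fit_stump([add_cluster_features(p, centroids) for p in X], y)  # step 2: classify

def classify(point):
    """Run a new sample through the same cluster-then-classify steps."""
    return predict(stump, add_cluster_features(point, centroids))

print(classify((3.0, 0.2)), classify((52.0, 0.9)))
```

A pure classifier in the sense of the abstract would correspond to calling `fit_stump(X, y)` on the raw features alone, which gives a baseline against which the clustered variant can be compared.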

Of all models tested, the gradient boosting classifier achieved the highest average scores during evaluation, with an average accuracy of 0.932, average recall of 0.913, average precision of 0.926 and average F1-score of 0.918. However, this model also had the lowest per-class recall for ransomware of all models, and both ensemble models that used gradient boosting as their classifier showed a slight improvement in ransomware classification performance. The model with the highest per-class recall for ransomware was the pure decision tree model, whose performance decreased slightly when clustering was added before classification. Overall, the results of the statistical evaluation suggest that more research is needed before any of the evaluated models are ready for real-life applications. A takeaway from this study is nonetheless that clustering before classification shows potential to improve classification outcomes for ransomware, which warrants further research into how ensembles are best applied in ransomware detection.
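The gap between a model's average scores and its per-class recall for ransomware can be illustrated with a small calculation. The sketch below computes accuracy, per-class precision, recall and F1, and their macro averages from a list of true and predicted labels. The labels are invented toy data, and macro averaging over classes is an assumption, since the abstract does not state how its averages were formed.

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_metrics(y_true, y_pred):
    """Precision, recall and F1 for every class, treating it as the positive class."""
    metrics = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

def macro_average(metrics, name):
    """Unweighted mean of one metric over all classes."""
    return sum(m[name] for m in metrics.values()) / len(metrics)

# Invented toy labels: six benign samples, four ransomware samples,
# with one ransomware sample misclassified as benign.
y_true = ["benign"] * 6 + ["ransomware"] * 4
y_pred = ["benign"] * 6 + ["ransomware"] * 3 + ["benign"]

metrics = per_class_metrics(y_true, y_pred)
print("accuracy:", accuracy(y_true, y_pred))
print("macro recall:", macro_average(metrics, "recall"))
print("ransomware recall:", metrics["ransomware"]["recall"])
```

On this toy data the accuracy is 0.9 and the macro recall is 0.875, while the recall for the ransomware class alone is 0.75, which is why a per-class view matters when missed ransomware is the costly error.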

Place, publisher, year, edition, pages
2024. p. 40
Keywords [en]
Ransomware, Cyber security, Machine learning, Dynamic analysis, Sandbox
National Category
Other Engineering and Technologies; Computer Sciences; Engineering and Technology
Identifiers
URN: urn:nbn:se:umu:diva-225794
OAI: oai:DiVA.org:umu-225794
DiVA, id: diva2:1866811
External cooperation
Omegapoint
Subject / course
Examensarbete i Interaktionsteknik och design (Degree Project in Interaction Technology and Design)
Educational program
Master of Science Programme in Interaction Technology and Design - Engineering
Supervisors
Examiners
Available from: 2024-09-12. Created: 2024-06-08. Last updated: 2025-02-18. Bibliographically approved.

Open Access in DiVA

Machine Learning Algorithms for Proactive Ransomware Threat Hunting (405 kB), 333 downloads
File information
File name: FULLTEXT01.pdf
File size: 405 kB
Checksum (SHA-512): bd24ff88d47c56861e0a0057ec8a50e0a714914b93bc45332791797e00fc91ad4d832f6e392f8da35296b4fee6385b85915618cc104e8206913d4afc24099a9b
Type: fulltext
Mimetype: application/pdf
