Trustworthy machine learning
Meyers, Charles (Umeå University, Faculty of Science and Technology, Department of Computing Science). ORCID iD: 0000-0002-1277-9811
2025 (English). Doctoral thesis, comprehensive summary (Other academic).
Alternative title
Tillförlitlig maskininlärning (Swedish)
Abstract [sv]

This thesis studies robustness, privacy, and reproducibility in safety-critical machine learning, with particular emphasis on computer vision, anomaly detection, and evasion attacks.

The work begins by analysing the practical costs and benefits of defence strategies against adversarial attacks, showing that common robustness metrics are poor indicators of real-world adversarial performance (Paper I). Through large-scale experiments, the work further shows that adversarial examples can often be generated in linear time, giving attackers a computational advantage over defenders (Paper II). To address this, the thesis presents a new metric, the Training Rate and Survival Heuristic (TRASH), for predicting model failure under attack and facilitating early rejection of vulnerable architectures (Paper III). This metric was then extended to real-world costs, showing that adversarial robustness can be improved using cheap, low-precision hardware without sacrificing accuracy (Paper IV).

Beyond robustness, the thesis addresses privacy by designing a lightweight client-side spam detection model that preserves user data and withstands several classes of attacks without requiring server-side computation (Paper V). In response to the need for reproducible and auditable experiments in safety-critical contexts, the thesis also presents "deckard", a declarative software framework for distributed and robust machine learning experiments (Paper VI).

Together, these contributions offer empirical techniques for evaluating and improving model robustness, propose a privacy-preserving classification strategy, and deliver practical tooling for reproducible experiments. Overall, the thesis advances the goal of building machine learning systems that are not only accurate, but also robust, reproducible, and trustworthy.

Abstract [en]

This thesis studies adversarial robustness, privacy, and reproducibility in safety-critical machine learning systems, with particular emphasis on computer vision, anomaly detection, and evasion attacks, through a series of papers.

The work begins by analysing the practical costs and benefits of defence strategies against adversarial attacks, revealing that common robustness metrics are poor indicators of real-world adversarial performance (Paper I). Through large-scale experiments, it further demonstrates that adversarial examples can often be generated in linear time, granting attackers a computational advantage over defenders (Paper II). To address this, a novel metric, the Training Rate and Survival Heuristic (TRASH), was developed to predict model failure under attack and facilitate early rejection of vulnerable architectures (Paper III). This metric was then extended to real-world cost, showing that adversarial robustness can be improved using low-cost, low-precision hardware without sacrificing accuracy (Paper IV).

Beyond robustness, the thesis tackles privacy by designing a lightweight, client-side spam detection model that preserves user data and resists several classes of attacks without requiring server-side computation (Paper V). Recognizing the need for reproducible and auditable experiments in safety-critical contexts, the thesis also presents deckard, a declarative software framework for distributed and robust machine learning experimentation (Paper VI).

Together, these contributions offer empirical techniques for evaluating and improving model robustness, propose a privacy-preserving classification strategy, and deliver practical tooling for reproducible experimentation. Ultimately, this thesis advances the goal of building machine learning systems that are not only accurate, but also robust, reproducible, and trustworthy.

Place, publisher, year, edition, pages
Umeå, Sweden: Umeå University, 2025, p. 66
Publication channel
978-91-8070-722-0
Series
Report / UMINF, ISSN 0348-0542 ; 25.10
Keywords [en]
Machine Learning, Adversarial Machine Learning, Anomaly Detection, Computer Vision, Robustness, Artificial Intelligence, Trustworthy Machine Learning
Keywords [sv]
Adversarial machine learning, anomaly detection, artificial intelligence, computer vision, machine learning, robustness, trustworthy machine learning
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-238928
ISBN: 978-91-8070-722-0 (print)
ISBN: 978-91-8070-723-7 (electronic)
OAI: oai:DiVA.org:umu-238928
DiVA, id: diva2:1958910
Public defence
2025-06-11, UB.A.230 - Lindellhallen 3, Universitetstorget 4, Umeå, Sweden, 13:00 (English)
Funder
Knut and Alice Wallenberg Foundation, 2019.035
Available from: 2025-05-21. Created: 2025-05-16. Last updated: 2025-05-19. Bibliographically approved.
List of papers
1. Safety-critical computer vision: an empirical survey of adversarial evasion attacks and defenses on computer vision systems
2023 (English). In: Artificial Intelligence Review, ISSN 0269-2821, E-ISSN 1573-7462, Vol. 56, p. 217-251. Article in journal (Refereed). Published.
Abstract [en]

Given the growing prominence of production-level AI, adversarial attacks that can poison a machine learning model against a certain label, evade classification, or reveal sensitive data about the model and training data to an attacker pose fundamental problems to machine learning systems. Furthermore, much research has focused on the inverse relationship between robustness and accuracy, raising problems for real-time and safety-critical systems, particularly since they are governed by legal constraints in which software changes must be explainable and every change must be thoroughly tested. While many defenses have been proposed, they are often computationally expensive and tend to reduce model accuracy. We have therefore conducted a large survey of attacks and defenses and present a simple and practical framework for analyzing any machine-learning system from a safety-critical perspective, using adversarial noise to find the upper bound of the failure rate. Using this method, we conclude that all tested configurations of the ResNet architecture fail to meet any reasonable definition of ‘safety-critical’ when tested on even small-scale benchmark data. We examine state-of-the-art defenses and attacks against computer vision systems with a focus on safety-critical applications in autonomous driving, industrial control, and healthcare. By testing a combination of attacks and defenses, their efficacy, and their run-time requirements, we provide substantial empirical evidence that modern neural networks consistently fail to meet established safety-critical standards by a wide margin.
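As a rough illustration of the kind of analysis the survey describes, the sketch below measures the empirical failure rate of a simple classifier under bounded adversarial noise. It is not the paper's benchmark or attack suite: the logistic-regression model, the synthetic data, the hand-rolled one-step gradient-sign perturbation, and the epsilon budgets are all illustrative assumptions.

```python
# Hypothetical sketch: measure the failure rate of a simple classifier under
# bounded adversarial noise. Model, data, and epsilon budgets are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
w, b = clf.coef_.ravel(), clf.intercept_[0]

def perturb(X, y, eps):
    """One-step, L-infinity-bounded perturbation that increases the log-loss."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(y = 1)
    grad = (p - y)[:, None] * w[None, :]     # gradient of the loss w.r.t. x
    return X + eps * np.sign(grad)

for eps in (0.0, 0.1, 0.3, 0.5):
    X_adv = perturb(X_te, y_te, eps)
    fail = np.mean(clf.predict(X_adv) != y_te)
    print(f"eps = {eps:.1f}  adversarial failure rate = {fail:.3f}")
```

In the paper's framing, the failure rate measured under such worst-case noise acts as a pessimistic estimate to compare against a safety-critical failure budget.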

Place, publisher, year, edition, pages
Springer, 2023
Keywords
Adversarial machine learning, Computer vision, Autonomous vehicles, Safety-critical
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-211212 (URN)
10.1007/s10462-023-10521-4 (DOI)
001014695900002 ()
2-s2.0-85162639161 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation, 2019.0352
Available from: 2023-06-29. Created: 2023-06-29. Last updated: 2025-05-19. Bibliographically approved.
2. Massively parallel evasion attacks and the pitfalls of adversarial retraining
2024 (English). In: EAI Endorsed Transactions on Internet of Things, E-ISSN 2414-1399, Vol. 10. Article in journal (Refereed). Published.
Abstract [en]

Even with widespread adoption of automated anomaly detection in safety-critical areas, both classical and advanced machine learning models are susceptible to first-order evasion attacks that fool models at run-time (e.g. an automated firewall or an anti-virus application). Kernelized support vector machines (KSVMs) are an especially useful model because they combine a complex geometry with low run-time requirements, acting as a run-time lower bound when compared to contemporary models (e.g. deep neural networks) and providing a cost-efficient way to measure model and attack run-time costs. To properly measure and combat adversaries, we propose a massively parallel projected gradient descent (PGD) evasion attack framework. Through theoretical examinations and experiments carried out using linearly separable Gaussian normal data, we present (i) a massively parallel naive attack, showing that adversarial retraining is unlikely to be an effective means to combat an attacker even on linearly separable datasets, (ii) a cost-effective way of evaluating model defences and attacks, together with an extensible code base for doing so, (iii) an inverse relationship between adversarial robustness and benign accuracy, (iv) the lack of a general relationship between attack time and efficacy, and (v) evidence that adversarial retraining increases compute time exponentially while failing to reliably prevent highly-confident false classifications.
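The following sketch conveys the flavour of such an attack, as a toy reconstruction rather than the paper's framework: a batched (array-parallel) projected gradient descent evasion attack against an RBF-kernel SVM trained on roughly linearly separable Gaussian data. The data distribution, attack budget, step size, and iteration count are all invented for illustration.

```python
# Hypothetical sketch: a batched PGD evasion attack on an RBF-kernel SVM
# trained on roughly linearly separable Gaussian data (toy parameters only).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n, gamma = 500, 0.5
X = np.vstack([rng.normal(-2.5, 1.0, size=(n, 2)), rng.normal(2.5, 1.0, size=(n, 2))])
y = np.concatenate([-np.ones(n), np.ones(n)])

svm = SVC(kernel="rbf", gamma=gamma).fit(X, y)
sv, coef, b = svm.support_vectors_, svm.dual_coef_.ravel(), svm.intercept_[0]

def decision_and_grad(Xb):
    """Decision values f(x) and their input gradients for the fitted RBF SVM."""
    diff = Xb[:, None, :] - sv[None, :, :]            # (batch, n_sv, dim)
    k = np.exp(-gamma * np.sum(diff ** 2, axis=-1))   # kernel values K(x, sv)
    f = k @ coef + b
    grad = -2.0 * gamma * np.einsum("ij,ijk->ik", k * coef[None, :], diff)
    return f, grad

def pgd(Xb, yb, eps=3.0, alpha=0.2, steps=50):
    """Push every point toward a wrong prediction inside an L-infinity ball."""
    X_adv = Xb.copy()
    for _ in range(steps):
        _, grad = decision_and_grad(X_adv)
        X_adv = X_adv - alpha * np.sign(yb[:, None] * grad)  # decrease y * f(x)
        X_adv = np.clip(X_adv, Xb - eps, Xb + eps)           # project onto budget
    return X_adv

X_adv = pgd(X, y)
print("benign accuracy:     ", np.mean(svm.predict(X) == y))
print("adversarial accuracy:", np.mean(svm.predict(X_adv) == y))
```

Because the whole batch is perturbed with vectorised array operations, the attack cost grows roughly linearly with the number of steps, which is the computational asymmetry the paper highlights.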

Place, publisher, year, edition, pages
Gent: EAI, 2024
Keywords
Machine Learning, Support Vector Machines, Trustworthy AI, Anomaly Detection, AI for Cybersecurity
National Category
Computer graphics and computer vision; Computer and Information Sciences
Identifiers
urn:nbn:se:umu:diva-228214 (URN)
10.4108/eetiot.6652 (DOI)
2-s2.0-85200255571 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation, 2019.0352
Available from: 2024-08-05. Created: 2024-08-05. Last updated: 2025-05-19. Bibliographically approved.
3. A training rate and survival heuristic for inference and robustness evaluation (Trashfire)
2025 (English). In: Proceedings of 2024 International Conference on Machine Learning and Cybernetics, IEEE, 2025, p. 613-623. Conference paper, Published paper (Refereed).
Abstract [en]

Machine learning models (deep neural networks in particular) have performed remarkably well on benchmark datasets across a wide variety of domains. However, the ease of finding adversarial counter-examples remains a persistent problem when training times are measured in hours or days and the time needed to find a successful adversarial counter-example is measured in seconds. Much work has gone into generating and defending against these adversarial counter-examples; however, the relative costs of attacks and defences are rarely discussed. Additionally, machine learning research is almost entirely guided by test/train metrics, but these would require billions of samples to meet industry standards. The present work addresses the problem of understanding and predicting how particular model hyper-parameters influence the performance of a model in the presence of an adversary. The proposed approach uses survival models, worst-case examples, and a cost-aware analysis to precisely and accurately reject a particular model change during routine model training procedures, rather than relying on real-world deployment, expensive formal verification methods, or accurate simulations of very complicated systems (e.g., digitally recreating every part of a car or a plane). Through an evaluation of many pre-processing techniques, adversarial counter-examples, and neural network configurations, the conclusion is that deeper models do offer marginal gains in survival times compared to their shallower counterparts. However, we show that those gains are driven more by the model inference time than by inherent robustness properties. Using the proposed methodology, we show that ResNet is hopelessly insecure against even the simplest of white-box attacks.
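To make the survival-modelling idea concrete, here is a minimal sketch, assuming synthetic data and invented covariates, of fitting an accelerated failure time model to "time until a successful adversarial example" measurements with the lifelines library. It is not the TRASH implementation itself; the column names, the generated data, and the accept/reject check are hypothetical.

```python
# Hypothetical sketch: fit an accelerated failure time model to synthetic
# "seconds until a successful evasion" data and inspect which covariates
# drive survival. Column names and the acceptance check are invented.
import numpy as np
import pandas as pd
from lifelines import WeibullAFTFitter

rng = np.random.default_rng(2)
n = 400
depth = rng.integers(10, 152, size=n)                  # e.g. network depth
inference_ms = 0.5 * depth + rng.normal(0.0, 2.0, n)   # deeper => slower inference
# Synthetic survival times driven mostly by inference time, not depth itself.
attack_time = rng.weibull(1.5, n) * (1.0 + 0.05 * inference_ms)
failed = (rng.random(n) < 0.9).astype(int)             # 1 = evasion succeeded

df = pd.DataFrame({"attack_time": attack_time, "failed": failed,
                   "depth": depth, "inference_ms": inference_ms})

aft = WeibullAFTFitter().fit(df, duration_col="attack_time", event_col="failed")
aft.print_summary()

# Toy accept/reject check in the spirit of a survival heuristic: compare the
# predicted median survival time under attack against a (made-up) budget.
budget_s = 1.0
median_survival = aft.predict_median(df[["depth", "inference_ms"]]).median()
print("accept candidate:", bool(median_survival > budget_s))
```

The fitted coefficients make it possible to ask whether apparent robustness gains track an architectural choice or merely its inference time, which is the distinction the paper draws.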

Place, publisher, year, edition, pages
IEEE, 2025
Series
Proceedings (International Conference on Machine Learning and Cybernetics), ISSN 2160-133X, E-ISSN 2160-1348
Keywords
Machine Learning, Computer Vision, Neural Networks, Adversarial AI, Trustworthy AI
National Category
Artificial Intelligence; Security, Privacy and Cryptography; Computer Sciences
Identifiers
urn:nbn:se:umu:diva-237109 (URN)
10.1109/ICMLC63072.2024.10935101 (DOI)
2-s2.0-105002274020 (Scopus ID)
9798331528041 (ISBN)
9798331528058 (ISBN)
Conference
2024 International Conference on Machine Learning and Cybernetics (ICMLC), Miyazaki, Japan, September 20-23, 2024
Funder
Knut and Alice Wallenberg Foundation, 2019.0352
eSSENCE - An eScience Collaboration
Available from: 2025-04-02. Created: 2025-04-02. Last updated: 2025-05-19. Bibliographically approved.
4. A cost-aware approach to adversarial robustness in neural networks
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Considering the growing prominence of production-level AI and the threat of adversarial attacks that can evade a model at run-time, evaluating the robustness of models to these evasion attacks is of critical importance. Additionally, testing model changes typically means deploying the model to a real device (e.g., a car, a medical imaging device, or a drone) to see how the change affects performance, making untested changes a public problem that reduces development speed, increases the cost of development, and makes it difficult (if not impossible) to parse cause from effect. In this work, we used survival analysis as a cloud-native, time-efficient, and precise method for predicting model performance in the presence of adversarial noise. For neural networks in particular, the relationships between the learning rate, batch size, training time, convergence time, and deployment cost are highly complex, so researchers generally rely on benchmark datasets to assess the ability of a model to generalize beyond the training data. In practice, however, this means that each model configuration needs to be evaluated against real-world deployment samples, which can be prohibitively expensive or time-consuming to collect, especially when other parts of the software or hardware stack are developed in parallel. To address this, we propose using accelerated failure time models to measure the effect of hardware choice, batch size, number of epochs, and test-set accuracy by using adversarial attacks to induce failures on a reference model architecture before deploying the model to the real world. We evaluate several GPU types and use the Tree Parzen Estimator to maximize model robustness and minimize model run-time simultaneously. This provides a way to evaluate and optimise the model in a single step, while simultaneously allowing us to model the effect of model parameters on training time, prediction time, and accuracy. Using this technique, we demonstrate that newer, more powerful hardware does decrease the training time, but at a monetary and power cost that far outpaces the marginal gains in accuracy.
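A minimal sketch of the optimisation step described above, using the Tree Parzen Estimator as implemented by Optuna's TPESampler. The `train_and_attack` stand-in, its parameters, and the scalarised robustness-minus-runtime objective are assumptions for illustration; the actual work pairs the search with accelerated failure time models and real evasion attacks.

```python
# Hypothetical sketch: use the Tree Parzen Estimator (Optuna's TPESampler) to
# trade adversarial robustness against run-time with a scalarised objective.
# "train_and_attack" is a placeholder for a real training + attack pipeline.
import optuna


def train_and_attack(batch_size: int, epochs: int, lr: float) -> tuple[float, float]:
    """Placeholder: pretend larger budgets help robustness but cost time."""
    robustness = min(0.9, 0.3 + 0.02 * epochs + 0.1 * lr)
    runtime_s = 0.5 * epochs * (batch_size / 64)
    return robustness, runtime_s


def objective(trial: optuna.Trial) -> float:
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128, 256])
    epochs = trial.suggest_int("epochs", 1, 20)
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    robustness, runtime_s = train_and_attack(batch_size, epochs, lr)
    # Scalarise: reward adversarial accuracy, penalise wall-clock cost.
    return robustness - 0.01 * runtime_s


study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print("best parameters:", study.best_params)
print("best scalarised score:", study.best_value)
```

Scalarising the two goals keeps the sketch simple; the weight on run-time is an arbitrary choice and would in practice reflect the monetary and power costs the abstract mentions.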

Keywords
artificial intelligence, machine learning, adversarial AI, optimisation, compliance
National Category
Computer Sciences
Research subject
Computer Science; Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-238922 (URN)
Funder
Knut and Alice Wallenberg Foundation, 2019.0352
Available from: 2025-05-16. Created: 2025-05-16. Last updated: 2025-05-19. Bibliographically approved.
5. A tiny, client-side classifier
(English)Manuscript (preprint) (Other academic)
Abstract [en]

The recent developments in machine learning have highlighted a conflict between online platforms and their users in terms of privacy. The importance of user privacy and the struggle for power over user data have been intensified as regulators and operators attempt to police the online platforms. As users have become increasingly aware of privacy issues, client-side data storage, management, and analysis have become a favoured alternative to large-scale centralised machine learning. However, state-of-the-art machine learning methods require vast amounts of labelled user data, making them unsuitable for models that reside client-side and only have access to a single user's data. State-of-the-art methods are also computationally expensive, which degrades the user experience on compute-limited hardware and reduces battery life. A recent alternative approach has proven remarkably successful in classification tasks across a wide variety of data: using a compression-based distance measure (the normalized compression distance) to measure the distance between generic objects in classical distance-based machine learning methods. In this work, we demonstrate that the normalized compression distance is actually not a metric; develop it for the wider context of kernel methods to allow modelling of complex data; and present techniques to improve the training time of models that use this distance measure. We show that the normalised compression distance works as well as, and sometimes better than, other metrics and kernels, without incurring additional computational costs and in spite of the lack of formal metric properties. The end result is a simple model with remarkable accuracy even when trained on a very small number of samples, allowing for models that are small and effective enough to run entirely on a client device using only user-supplied data.
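For concreteness, here is a minimal sketch of the compression-based distance described above: the normalised compression distance computed with zlib, combined with a nearest-neighbour vote. The toy messages and the 1-NN rule are illustrative assumptions; the thesis' kernelised development and training-time optimisations are not shown.

```python
# Hypothetical sketch: normalised compression distance (NCD) plus a
# nearest-neighbour vote for spam detection. Toy messages only; the thesis'
# kernelised variant and training-time optimisations are not shown.
import zlib

def c(data: bytes) -> int:
    """Compressed length, a practical stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

train = [
    (b"win a free prize now, click here", "spam"),
    (b"cheap pills, limited offer, buy now", "spam"),
    (b"meeting moved to 3pm, see agenda attached", "ham"),
    (b"can you review my draft before friday?", "ham"),
]

def classify(msg: bytes, k: int = 1) -> str:
    """Label a message by majority vote among its k NCD-nearest training items."""
    neighbours = sorted(train, key=lambda item: ncd(msg, item[0]))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

print(classify(b"click here now to claim your free prize"))  # likely "spam"
print(classify(b"agenda for friday's review meeting"))        # likely "ham"
```

Because the only learned state is the user's own labelled messages and a general-purpose compressor, such a classifier can live entirely on the client device, which is the privacy argument the abstract makes.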

Keywords
Text classification, kernel methods, spam detection, privacy, classification, compression
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-238924 (URN)
Available from: 2025-05-16. Created: 2025-05-16. Last updated: 2025-05-19. Bibliographically approved.
6. deckard: A declarative tool for machine learning robustness evaluation
(English)Manuscript (preprint) (Other academic)
Abstract [en]

The software package presented, called `deckard`, is a modular software toolkit designed to streamline and standardize experimentation in machine learning (ML), with a particular focus on the adversarial scenario. It provides a flexible, extensible framework for defining, executing, and analyzing end-to-end ML pipelines in the context of a malicious actor. Because it is built on top of the Hydra configuration system, `deckard` supports declarative YAML-based configuration of data preprocessing, model training, and adversarial attack pipelines, enabling reproducible, framework-agnostic experimentation across diverse ML settings.

In addition to configuration management, `deckard` includes a suite of utilities for distributed and parallel execution, automated hyperparameter optimisation, visualisation, and result aggregation. The tooling abstracts away much of the engineering overhead typically involved in adversarial ML research, allowing researchers to focus on algorithmic insights rather than implementation details. The presented software facilitates rigorous benchmarking by maintaining an auditable trace of configurations, random seeds, and intermediate outputs throughout the experimental lifecycle.

The system is compatible with a variety of ML frameworks and several classes of adversarial attacks, making it a suitable backend for both large-scale automated testing and fine-grained empirical analysis. By providing a unified interface for experimental control, `deckard` accelerates the development and evaluation of robust models, and helps close the gap between research prototypes and verifiable, reproducible results.
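As an illustration of what a declarative, Hydra-style experiment definition can look like, the sketch below builds and resolves a small YAML configuration with OmegaConf (the library underlying Hydra). Every key, value, and pipeline stage in it is invented for illustration and is not `deckard`'s actual configuration schema.

```python
# Hypothetical sketch of a declarative, YAML-driven experiment definition in the
# Hydra/OmegaConf style that the abstract describes. The keys, values, and
# pipeline stages below are invented and are NOT deckard's actual schema.
from omegaconf import OmegaConf

yaml_config = """
seed: 42
data:
  name: mnist
  preprocessing:
    - standard_scaler
model:
  library: torch
  name: resnet18
  fit:
    epochs: ${training.epochs}
training:
  epochs: 10
attack:
  name: projected_gradient_descent
  eps: 0.3
  max_iter: 40
files:
  results: output/${model.name}/${attack.name}/scores.json
"""

cfg = OmegaConf.create(yaml_config)
OmegaConf.resolve(cfg)            # expand ${...} interpolations
print(OmegaConf.to_yaml(cfg))     # the fully resolved, auditable experiment spec
print("results will be written to:", cfg.files.results)
```

Keeping the entire pipeline in one resolvable document is what makes the configuration, seeds, and output paths auditable after the fact, which is the reproducibility property the abstract emphasises.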

Keywords
software, machine learning, robustness, reproducibility, compliance
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-238925 (URN)
Funder
Knut and Alice Wallenberg Foundation, 2019.0352
eSSENCE - An eScience Collaboration
Available from: 2025-05-16. Created: 2025-05-16. Last updated: 2025-05-19. Bibliographically approved.

Open Access in DiVA

fulltext: FULLTEXT04.pdf, 2343 kB, application/pdf
Checksum (SHA-512): 2155e5884b50c62e34948c23ebd91e1031f9052f28ceddf2b30e0066a451fa03d082d579e887583d29a1b4a10d9354eb1f571e1dc75363f65da88ab3f104804a

spikblad: SPIKBLAD01.pdf, 219 kB, application/pdf
Checksum (SHA-512): defb05e78309b0c8fe6b10a2defa4e23b99cb557a8896962e369c3d5a048bdffa2e56be427de91136f81734376d65195ddfb75cd2efa97b0511e0d15f18a23bc
