Publications (10 of 91)
Varshney, A. K. & Torra, V. (2025). Concept drift detection using ensemble of integrally private models. In: Rosa Meo; Fabrizio Silvestri (Ed.), Machine Learning and Principles and Practice of Knowledge Discovery in Databases: International Workshops of ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Revised Selected Papers, Part V. Paper presented at MLCS@ECML-PKDD 2023, The 5th Workshop on Machine Learning for CyberSecurity, Turin, Italy, September 18-22, 2023 (pp. 290-304). Springer
Concept drift detection using ensemble of integrally private models
2025 (English). In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases: International Workshops of ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Revised Selected Papers, Part V / [ed] Rosa Meo; Fabrizio Silvestri, Springer, 2025, p. 290-304. Conference paper, Published paper (Refereed).
Abstract [en]

Deep neural networks (DNNs) are among the most widely used machine learning algorithms. DNNs require the training data to be available beforehand with true labels. This is not feasible for many real-world problems where data arrives in streaming form and the acquisition of true labels is scarce and expensive. In the literature, little attention has been given to the privacy aspect of streaming data, where the data may change its distribution frequently. These concept drifts must be detected privately in order to avoid any disclosure risk from DNNs. Existing privacy models use concept drift detection schemes such as ADWIN and KSWIN to detect the drifts. In this paper, we focus on the notion of integrally private DNNs to detect concept drifts. Integrally private DNNs are models that recur frequently when trained from different datasets. Based on this, we introduce an ensemble methodology, which we call the 'Integrally Private Drift Detection' (IPDD) method, to detect concept drift from private models. Our IPDD method does not require labels to detect drift but assumes true labels are available once the drift has been detected. We have experimented with binary and multi-class synthetic and real-world data. Our experimental results show that our methodology can privately detect concept drift, has utility comparable with (and in some cases better than) ADWIN, and outperforms the utility of differentially private models at different privacy levels.
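The ensemble idea lends itself to a compact illustration. Below is a minimal sketch of unlabeled drift detection via ensemble disagreement, the mechanism underlying IPDD as described above; the plain decision trees stand in for the paper's integrally private models, and the window size and disagreement threshold are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of unlabeled drift detection via ensemble disagreement.
# Illustrative only: the decision trees stand in for the paper's integrally
# private models, and `window` / `threshold` are assumed values.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_ensemble(X, y, n_models=5, seed=0):
    """Train models on bootstrap subsamples of the initial labeled batch."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=len(X), replace=True)
        models.append(DecisionTreeClassifier(max_depth=5).fit(X[idx], y[idx]))
    return models

def disagreement(models, X_window):
    """Fraction of points on which the ensemble is not unanimous."""
    preds = np.stack([m.predict(X_window) for m in models])  # (n_models, n)
    return np.mean(preds.min(axis=0) != preds.max(axis=0))

def detect_drift(models, stream, window=200, threshold=0.3):
    """Scan an unlabeled stream window by window; no labels needed."""
    for start in range(0, len(stream) - window + 1, window):
        if disagreement(models, stream[start:start + window]) > threshold:
            return start  # drift suspected; request true labels from here on
    return None
```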

Place, publisher, year, edition, pages
Springer, 2025
Series
Communications in Computer and Information Science, ISSN 1865-0929, E-ISSN 1865-0937 ; 2137
Keywords
Data privacy, Integral privacy, Concept Drift, Private drift, Deep neural networks, Streaming data.
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-222796 (URN); 10.1007/978-3-031-74643-7_22 (DOI); 2-s2.0-85215978495 (Scopus ID); 978-3-031-74643-7 (ISBN); 978-3-031-74642-0 (ISBN)
Conference
MLCS@ECML-PKDD 2023, The 5th Workshop on Machine Learning for CyberSecurity, Turin, Italy, September 18-22, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-03-27. Created: 2024-03-27. Last updated: 2025-02-11. Bibliographically approved.
Torra, V. (2025). Differentially private Choquet integral: extending mean, median, and order statistics. International Journal of Information Security, 24(1), Article ID 68.
Differentially private Choquet integral: extending mean, median, and order statistics
2025 (English). In: International Journal of Information Security, ISSN 1615-5262, E-ISSN 1615-5270, Vol. 24, no 1, article id 68. Article in journal (Refereed). Published.
Abstract [en]

The Choquet integral is a well-known aggregation function that generalizes several other well-known functions. For example, appropriate parameterizations reduce a Choquet integral to the arithmetic mean, the weighted mean, order statistics, and linear combinations of order statistics. This integral has been used extensively in data fusion, with applications in computer science, economics, and decision making. Formally, Choquet integrals integrate a function (the data to be aggregated) with respect to a non-additive measure, also called a fuzzy measure (which represents the background knowledge on the information sources that provide the data to be aggregated). In this paper we propose a privacy-preserving Choquet integral which satisfies differential privacy. We then study the sensitivity of the Choquet integral with respect to different types of fuzzy measures. Our results generalize previous knowledge about the sensitivity of the minimum, the maximum, and the arithmetic mean.
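For context, the discrete Choquet integral that the abstract generalizes privately can be written compactly; the Laplace line below is the generic differential-privacy recipe calibrated to global sensitivity, not the paper's specific mechanism.

```latex
% Discrete Choquet integral of x = (x_1,...,x_n) w.r.t. a fuzzy measure \mu,
% with x_{(1)} \le \dots \le x_{(n)} a nondecreasing reordering, x_{(0)} := 0,
% and A_{(i)} = \{(i), (i+1), \dots, (n)\} the indices of the n-i+1 largest inputs:
\[
  C_\mu(x) \;=\; \sum_{i=1}^{n} \bigl(x_{(i)} - x_{(i-1)}\bigr)\,\mu\bigl(A_{(i)}\bigr).
\]
% Setting \mu(A) = |A|/n recovers the arithmetic mean; setting \mu(A) = 1 if
% |A| \ge n-k+1 and 0 otherwise yields the k-th order statistic.
% A generic route to \epsilon-differential privacy is the Laplace mechanism,
% calibrated to the global sensitivity over neighboring inputs x \sim x':
\[
  \tilde{C}_\mu(x) \;=\; C_\mu(x) + \mathrm{Lap}\!\bigl(\Delta C_\mu / \epsilon\bigr),
  \qquad
  \Delta C_\mu \;=\; \max_{x \sim x'} \bigl| C_\mu(x) - C_\mu(x') \bigr|.
\]
```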

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
Choquet integral, Differential privacy, Information aggregation, Mean and median
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-236125 (URN); 10.1007/s10207-025-00984-7 (DOI); 001404827800001 (); 2-s2.0-85218416161 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2023-05541
Available from: 2025-03-07. Created: 2025-03-07. Last updated: 2025-03-07. Bibliographically approved.
Varshney, A. K. & Torra, V. (2025). Efficient federated unlearning under plausible deniability. Machine Learning, 114(1), Article ID 25.
Efficient federated unlearning under plausible deniability
2025 (English). In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 114, no 1, article id 25. Article in journal (Refereed). Published.
Abstract [en]

Privacy regulations like the GDPR in Europe and the CCPA in the US grant users the right to remove their data from machine learning (ML) applications. Machine unlearning addresses this by modifying the ML model's parameters in order to forget the influence of a specific data point on its weights. Recent literature has highlighted that the contribution of a data point can be forged with some other data points in the dataset with probability close to one. This allows a server to falsely claim unlearning without actually modifying the model's parameters. However, in distributed paradigms such as federated learning (FL), where the server lacks access to the dataset and the number of clients is limited, claiming unlearning in this way becomes a challenge: an honest server must modify the model parameters in order to unlearn. This paper introduces an efficient way to achieve machine unlearning in FL, i.e., federated unlearning, by employing a privacy model which allows the FL server to plausibly deny the client's participation in the training up to a certain extent. Specifically, we demonstrate that the server can generate a Proof-of-Deniability, where each aggregated update can be associated with at least x (the plausible deniability parameter) client updates. This enables the server to plausibly deny a client's participation. However, in the event of frequent unlearning requests, the server is required to adopt an unlearning strategy and, accordingly, update its model parameters. We also perturb the client updates within a cluster in order to avoid inference by an honest-but-curious server. We show that the global model satisfies (ε, δ)-differential privacy after T communication rounds. The proposed methodology has been evaluated on multiple datasets in different privacy settings. The experimental results show that our framework achieves comparable utility while providing a significant reduction in memory (≈ 30 times) as well as retraining time (1.6-500769 times). The source code for the paper is available at https://github.com/Ayush-Umu/Federated-Unlearning-under-Plausible-Deniability
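A minimal sketch of the aggregation idea — every published aggregate is formed from at least x client updates, so no single client's participation can be pinned down — follows. The grouping strategy, noise scale, and function names are illustrative assumptions, not the paper's Proof-of-Deniability construction.

```python
# Sketch of aggregation with a plausible-deniability parameter x: each
# published aggregate mixes at least x client updates, so the server can
# deny any single client's participation. Grouping, noise scale, and names
# are illustrative assumptions (assumes n_clients >= x; in a fuller
# implementation leftover clients would be folded into the last group).
import numpy as np

def deniable_aggregate(client_updates, x=5, noise_scale=0.01, seed=0):
    """client_updates: (n_clients, dim) array of model deltas."""
    rng = np.random.default_rng(seed)
    n = len(client_updates)
    order = rng.permutation(n)               # random grouping of clients
    aggregates = []
    for start in range(0, n - x + 1, x):     # groups of x clients each
        group = client_updates[order[start:start + x]]
        agg = group.mean(axis=0)
        agg += rng.normal(0.0, noise_scale, size=agg.shape)  # mask individuals
        aggregates.append(agg)
    return np.mean(aggregates, axis=0)       # global model update
```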

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
Machine unlearning, Federated unlearning, FedAvg, Integral privacy, Plausible deniability, Differential privacy
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-234248 (URN); 10.1007/s10994-024-06685-x (DOI); 001400054000004 (); 2-s2.0-85217772811 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-01-18. Created: 2025-01-18. Last updated: 2025-02-25. Bibliographically approved.
Taha, M. & Torra, V. (2025). Generalized F-spaces through the lens of fuzzy measures. Fuzzy sets and systems (Print), 507, Article ID 109317.
Generalized F-spaces through the lens of fuzzy measures
2025 (English). In: Fuzzy sets and systems (Print), ISSN 0165-0114, E-ISSN 1872-6801, Vol. 507, article id 109317. Article in journal (Refereed). Published.
Abstract [en]

Probabilistic metric spaces are natural extensions of metric spaces, where the function that computes the distance outputs a distribution on the real numbers rather than a single value. Such a function is called a distribution function. F-spaces are constructions for probabilistic metric spaces, where the distribution functions are built for functions that map from a measurable space to a metric space. In this paper, we propose an extension of F-spaces, called Generalized F-space. This construction replaces the metric space with a probabilistic metric space and uses fuzzy measures to evaluate sets of elements whose distances are probability distributions. We present several results that establish connections between the properties of the constructed space and specific fuzzy measures under particular triangular norms. Furthermore, we demonstrate how the space can be applied in machine learning to compute distances between different classifier models. Experimental results based on Sugeno λ-measures are consistent with our theoretical findings.
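Since the experiments rest on Sugeno λ-measures, a small sketch of how such a measure is constructed from densities may help orient the reader; the densities below are illustrative, not taken from the paper.

```python
# Sketch: building a Sugeno λ-measure from densities g_i in (0,1) on a
# finite set. λ is the unique root > -1 of  1 + λ = Π_i (1 + λ g_i), and
#   μ(A) = ( Π_{i∈A} (1 + λ g_i) − 1 ) / λ    (μ(A) = Σ_{i∈A} g_i if λ = 0).
# The densities below are illustrative, not values from the paper.
import numpy as np
from scipy.optimize import brentq

def sugeno_lambda(densities):
    f = lambda lam: np.prod([1 + lam * g for g in densities]) - (1 + lam)
    s = sum(densities)
    if np.isclose(s, 1.0):           # λ = 0: the measure is additive
        return 0.0
    # root lies in (-1, 0) if Σg > 1 and in (0, ∞) if Σg < 1
    lo, hi = (-1 + 1e-9, -1e-9) if s > 1 else (1e-9, 1e6)
    return brentq(f, lo, hi)

def mu(A, densities, lam):
    if np.isclose(lam, 0.0):
        return sum(densities[i] for i in A)
    return (np.prod([1 + lam * densities[i] for i in A]) - 1) / lam

g = [0.2, 0.3, 0.4]            # illustrative densities, Σg < 1 → λ > 0
lam = sugeno_lambda(g)
print(mu({0, 1, 2}, g, lam))   # ≈ 1.0 on the full set, by construction
```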

Keywords
Fuzzy measures, Probabilistic metric space
National Category
Computer Sciences; Computer Systems
Identifiers
urn:nbn:se:umu:diva-235860 (URN); 10.1016/j.fss.2025.109317 (DOI); 2-s2.0-85217744245 (Scopus ID)
Available from: 2025-02-24. Created: 2025-02-24. Last updated: 2025-02-24. Bibliographically approved.
Paul, S., Salas, J. & Torra, V. (2025). Improving locally differentially private graph statistics through sparseness-preserving noise-graph addition. In: Roberto Di Pietro; Karen Renaud; Paolo Mori (Ed.), Proceedings of the 11th International Conference on Information Systems Security and Privacy: Volume 2. Paper presented at 11th International Conference on Information Systems Security and Privacy, Porto, Portugal, February 20-22, 2025 (pp. 526-533). SciTePress, 2
Improving locally differentially private graph statistics through sparseness-preserving noise-graph addition
2025 (English). In: Proceedings of the 11th International Conference on Information Systems Security and Privacy: Volume 2 / [ed] Roberto Di Pietro; Karen Renaud; Paolo Mori, SciTePress, 2025, Vol. 2, p. 526-533. Conference paper, Oral presentation with published abstract (Refereed).
Abstract [en]

Differential privacy allows publishing graph statistics in a way that protects individual privacy while still allowing meaningful insights to be derived from the data. The centralized privacy model of differential privacy assumes that there is a trusted data curator, while the local model does not require such a trusted authority. Local differential privacy is commonly achieved through randomized response (RR) mechanisms. These do not preserve the sparseness of the graphs. As most real-world graphs are sparse and have many nodes, this is a drawback of RR-based mechanisms in terms of computational efficiency and accuracy. We therefore propose a comparative analysis, through experiments and discussion, of computing statistics with local differential privacy, where it is shown that preserving the sparseness of the original graphs is the key factor in balancing utility and privacy. We perform several experiments to test the utility of the protected graphs in terms of several sub-graph counts, i.e., triangle and star counting, and other statistics. We show that the sparseness-preserving algorithm gives comparable or better results than other state-of-the-art methods and improves computational efficiency.
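As background for the sparseness problem described above, here is a minimal sketch of edge-local randomized response on one node's adjacency bits; it shows concretely how RR densifies a sparse graph (parameter names and values are illustrative).

```python
# Sketch: Warner-style randomized response on one node's adjacency bits
# (edge local differential privacy). Each bit is kept with probability
# e^eps / (1 + e^eps) and flipped otherwise. The output is dense: a
# fraction 1/(1+e^eps) of the (mostly zero) non-edges become edges,
# which is the sparseness problem the paper addresses.
import numpy as np

def randomized_response(adj_bits, eps, rng):
    p_keep = np.exp(eps) / (1.0 + np.exp(eps))
    flips = rng.random(len(adj_bits)) >= p_keep
    return np.where(flips, 1 - adj_bits, adj_bits)

rng = np.random.default_rng(0)
n = 10_000
adj = np.zeros(n, dtype=int)
adj[rng.choice(n, size=20, replace=False)] = 1   # a sparse row: 20 edges
noisy = randomized_response(adj, eps=1.0, rng=rng)
print(adj.sum(), noisy.sum())   # ~20 vs ~2700: sparseness is lost
```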

Place, publisher, year, edition, pages
SciTePress, 2025
Series
ICISSP, ISSN 2184-4356
Keywords
Privacy in Large Network, Differential Privacy, Edge Local Differential Privacy
National Category
Computer and Information Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-237718 (URN); 10.5220/0013174400003899 (DOI); 2-s2.0-105001734608 (Scopus ID); 978-989-758-735-1 (ISBN)
Conference
11th International Conference on Information Systems Security and Privacy, Porto, Portugal, February 20-22, 2025
Projects
570011356
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP), 570011356
Available from: 2025-04-15. Created: 2025-04-15. Last updated: 2025-04-16. Bibliographically approved.
Varshney, A. K., Vandikas, K. & Torra, V. (2025). Unlearning clients, features and samples in vertical federated learning. Paper presented at The 25th Privacy Enhancing Technologies Symposium, Washington, USA and online, July 14-19, 2025. Proceedings on Privacy Enhancing Technologies, 2025(2), 39-53
Unlearning clients, features and samples in vertical federated learning
2025 (English). In: Proceedings on Privacy Enhancing Technologies, E-ISSN 2299-0984, Vol. 2025, no 2, p. 39-53. Article in journal (Refereed). Published.
Abstract [en]

Federated Learning (FL) has emerged as a prominent distributed learning paradigm that allows multiple users to collaboratively train a model without sharing their data, thus preserving privacy. Within the scope of privacy preservation, information privacy regulations such as GDPR entitle users to request the removal (or unlearning) of their contribution from a service that is hosting the model. For this purpose, a server hosting an ML model must be able to unlearn certain information in cases such as copyright infringement or security issues that can make the model vulnerable or impact the performance of a service based on that model. While most unlearning approaches in FL focus on Horizontal Federated Learning (HFL), where clients share the feature space and the global model, Vertical Federated Learning (VFL) has received less attention from the research community. VFL involves clients (passive parties) sharing the sample space among them while not having access to the labels. In this paper, we explore unlearning in VFL from three perspectives: unlearning passive parties, unlearning features, and unlearning samples. To unlearn passive parties and features we introduce VFU-KD, which is based on knowledge distillation (KD), while to unlearn samples we introduce VFU-GA, which is based on gradient ascent (GA). To provide evidence of approximate unlearning, we utilize a membership inference attack (MIA) to audit the effectiveness of our unlearning approach. Our experiments across six tabular datasets and two image datasets demonstrate that VFU-KD and VFU-GA achieve performance comparable to or better than both retraining from scratch and the benchmark R2S method in many cases, with improvements of 0–2%. In the remaining cases, utility scores remain comparable, with a modest utility loss ranging from 1–5%. Unlike existing methods, VFU-KD and VFU-GA require no communication between active and passive parties during unlearning. However, they do require the active party to store the previously communicated embeddings.
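The gradient-ascent idea behind VFU-GA admits a compact generic illustration: take a few ascent steps on the loss of the forget set, then repair on retained data. The sketch below uses plain logistic regression and illustrative step counts; it is a generic GA unlearning recipe, not the paper's VFL-specific method.

```python
# Sketch of gradient-ascent (GA) sample unlearning on logistic regression:
# a few *ascent* steps on the forget set push the model away from the
# samples to be unlearned; descent steps on retained data recover utility.
# Step sizes and counts are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_logloss(w, X, y):
    """Gradient of the mean logistic loss with respect to weights w."""
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def unlearn_ga(w, X_forget, y_forget, X_retain, y_retain,
               ascent_steps=10, descent_steps=20, lr=0.1):
    w = w.copy()
    for _ in range(ascent_steps):      # forget: gradient *ascent*
        w += lr * grad_logloss(w, X_forget, y_forget)
    for _ in range(descent_steps):     # repair: descent on retained data
        w -= lr * grad_logloss(w, X_retain, y_retain)
    return w
```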

Place, publisher, year, edition, pages
Privacy Enhancing Technologies Symposium Advisory Board, 2025
Keywords
Federated learning, Unlearning, Vertical federated learning, Auditing, Membership inference attack (MIA)
National Category
Security, Privacy and Cryptography
Identifiers
urn:nbn:se:umu:diva-237429 (URN); 10.56553/popets-2025-0048 (DOI)
Conference
The 25th Privacy Enhancing Technologies Symposium, Washington, USA and online, July 14-19, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-04-09. Created: 2025-04-09. Last updated: 2025-04-09. Bibliographically approved.
Negi, S. S. & Torra, V. (2024). A note on Sugeno exponential function with respect to distortion. Applied Mathematics and Computation, 470, Article ID 128586.
A note on Sugeno exponential function with respect to distortion
2024 (English). In: Applied Mathematics and Computation, ISSN 0096-3003, E-ISSN 1873-5649, Vol. 470, article id 128586. Article in journal (Refereed). Published.
Abstract [en]

This study explores the Sugeno exponential function, which is the solution to a first-order differential equation with respect to nonadditive measures, specifically distorted Lebesgue measures. We define the k-distorted semigroup property of the Sugeno exponential function, introduce a new addition operation on a set of distortion functions, and discuss some related results. Furthermore, the m-Bernoulli inequality, a more general inequality than the well-known Bernoulli inequality on the real line, is established for the Sugeno exponential function. Additionally, the above concept is extended to a system of differential equations with respect to the distorted Lebesgue measure, which gives rise to the study of a matrix m-exponential function.

Finally, we present an appropriate m-distorted logarithm function and describe its behavior when applied to various functions, such as the sum, product, quotient, etc., while maintaining basic algebraic structures. The results are illustrated throughout the paper with a variety of examples.
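For orientation, the setting the abstract refers to can be summarized as follows; this is the standard formulation of distorted Lebesgue measures and of the integral equation defining the Sugeno exponential, stated here as background rather than as the paper's new results.

```latex
% A distorted Lebesgue measure \mu_m distorts the Lebesgue measure \lambda
% by an increasing function m with m(0) = 0:
\[
  \mu_m(A) \;=\; m\bigl(\lambda(A)\bigr).
\]
% The Sugeno exponential arises as the solution of a first-order differential
% equation with respect to \mu_m, i.e. of the Choquet-integral equation
\[
  x(t) \;=\; x(0) + (C)\!\int_{[0,t]} x \, d\mu_m ,
\]
% which in the undistorted case m(\tau) = \tau reduces to x'(t) = x(t) and
% recovers the classical exponential x(t) = x(0)\,e^{t}.
```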

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Banach space, Bernoulli inequality, Choquet integral, Nonadditive measure, Semigroup theory, Time scale theory
National Category
Mathematical Analysis
Identifiers
urn:nbn:se:umu:diva-221028 (URN); 10.1016/j.amc.2024.128586 (DOI); 2-s2.0-85184150547 (Scopus ID)
Funder
The Kempe Foundations; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-03-06. Created: 2024-03-06. Last updated: 2024-03-06. Bibliographically approved.
Navarro-Arribas, G. & Torra, V. (2024). Attribute disclosure risk in smart meter data. In: Privacy in statistical databases: International conference, PSD 2024, Antibes Juan-les-Pins, France, September 25–27, 2024, proceedings. Paper presented at International Conference on Privacy in Statistical Databases, PSD 2024, Antibes Juan-les-Pins, France, September 25-27, 2024 (pp. 274-283). Springer Nature
Attribute disclosure risk in smart meter data
2024 (English). In: Privacy in statistical databases: International conference, PSD 2024, Antibes Juan-les-Pins, France, September 25–27, 2024, proceedings, Springer Nature, 2024, p. 274-283. Conference paper, Published paper (Refereed).
Abstract [en]

This paper studies attribute disclosure risk in aggregated smart meter data. Smart meter data is commonly aggregated to preserve the privacy of individual contributions. The published data shows aggregated consumption, preventing the revelation of individual consumption patterns. There is, however, a potential risk associated with aggregated data. We analyze some smart meter consumption datasets to show the potential risk of attribute disclosure. We observe that, even if data is aggregated with the most favorable aggregation approach, it still presents this attribute disclosure risk.
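A toy illustration (not an example from the paper) of why aggregates can still disclose attributes: because consumption readings are nonnegative, a zero aggregate reveals that every member consumed nothing in that interval, and any aggregate upper-bounds each member's consumption.

```python
# Toy illustration (not from the paper): with nonnegative readings, a zero
# aggregate discloses every member's value exactly, and any aggregate
# bounds each member's consumption from above.
import numpy as np

readings = np.array([
    [0.0, 1.2, 0.0],   # household 1, three intervals
    [0.0, 0.8, 0.1],   # household 2
    [0.0, 2.1, 0.0],   # household 3
])
aggregate = readings.sum(axis=0)          # what gets published
for t, total in enumerate(aggregate):
    if total == 0.0:
        print(f"interval {t}: every household provably consumed 0")
    else:
        print(f"interval {t}: each household consumed at most {total}")
```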

Place, publisher, year, edition, pages
Springer Nature, 2024
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14915
Keywords
Attribute disclosure, k-anonymity, Smart meter data
National Category
Computer Sciences; Information Systems, Social aspects
Identifiers
urn:nbn:se:umu:diva-230583 (URN); 10.1007/978-3-031-69651-0_18 (DOI); 2-s2.0-85205106629 (Scopus ID); 978-3-031-69650-3 (ISBN); 978-3-031-69651-0 (ISBN)
Conference
International Conference on Privacy in Statistical Databases, PSD 2024, Antibes Juan-les-Pins, France, September 25-27, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2022-04645
Available from: 2024-10-09. Created: 2024-10-09. Last updated: 2024-10-09. Bibliographically approved.
Garg, S. & Torra, V. (2024). Can synthetic data preserve manifold properties? In: Nikolaos Pitropakis; Sokratis Katsikas; Steven Furnell; Konstantinos Markantonakis (Ed.), ICT systems security and privacy protection: 39th IFIP international conference, SEC 2024, Edinburgh, UK, June 12–14, 2024, proceedings. Paper presented at 39th IFIP International Conference on ICT Systems Security and Privacy Protection, SEC 2024 (pp. 134-147). Cham: Springer
Can synthetic data preserve manifold properties?
2024 (English). In: ICT systems security and privacy protection: 39th IFIP international conference, SEC 2024, Edinburgh, UK, June 12–14, 2024, proceedings / [ed] Nikolaos Pitropakis; Sokratis Katsikas; Steven Furnell; Konstantinos Markantonakis, Cham: Springer, 2024, p. 134-147. Chapter in book (Refereed).
Abstract [en]

Machine learning has shown remarkable performance in modeling large datasets with complex patterns. As the amount of data increases, it often leads to high-dimensional feature spaces. This data may contain confidential information that must be safeguarded against disclosure. One way to make the data accessible could be by using anonymization. An alternative is to use synthetic data that mimics the behavior of the original data. GANs represent a prominent approach for generating synthetic samples that faithfully replicate the distributional characteristics of the original data. In scenarios involving high-dimensional data, preserving the geometric properties, structural integrity, and relative positioning of data points is paramount, as neglecting such information may compromise utility. This research aims to investigate the manifold properties of synthetically generated data and introduces a novel framework for producing privacy-preserving synthetic data while upholding the manifold structure of the original data. While existing studies predominantly focus on privacy preservation within GANs, the critical aspect of preserving the manifold structure of data remains unaddressed. Our novel approach adeptly addresses both privacy concerns and manifold structure preservation, distinguishing it from prior research endeavors. Comparative assessments against baseline models are conducted using metrics such as Maximum Mean Discrepancy (MMD), Fréchet Inception Distance (FID), and F1-score. Additionally, the privacy risk posed by the models is evaluated through data reconstruction attacks. Results demonstrate that the proposed framework exhibits diminished vulnerability to privacy breaches while more effectively preserving the intrinsic structure of the data.
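Of the evaluation metrics listed, MMD is the easiest to show compactly. Below is a minimal biased MMD² estimator with an RBF kernel; the bandwidth choice is an illustrative assumption (the median heuristic is common in practice).

```python
# Sketch: biased MMD^2 estimate between real and synthetic samples using an
# RBF kernel. Bandwidth `sigma` is an illustrative choice; in practice it
# is often set by the median heuristic.
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    """MMD^2 = E k(x,x') + E k(y,y') - 2 E k(x,y)."""
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 5))
synth = rng.normal(0.1, 1.0, size=(200, 5))  # a close synthetic distribution
print(mmd2(real, synth))                     # small value => similar samples
```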

Place, publisher, year, edition, pages
Cham: Springer, 2024
Series
IFIP Advances in Information and Communication Technology, ISSN 1868-4238, E-ISSN 1868-422X ; 710
Keywords
Generative Adversarial Network, k-Anonymity, Manifold Learning, Synthetic Data
National Category
Computational Mathematics; Computer Sciences
Identifiers
urn:nbn:se:umu:diva-228477 (URN); 10.1007/978-3-031-65175-5_10 (DOI); 2-s2.0-85200774719 (Scopus ID); 9783031651748 (ISBN); 9783031651779 (ISBN); 9783031651755 (ISBN)
Conference
39th IFIP International Conference on ICT Systems Security and Privacy Protection, SEC 2024
Note

Revised papers from the 39th IFIP International Conference on ICT Systems Security and Privacy Protection, SEC 2024, Edinburgh, UK, June 12-14, 2024.

Available from: 2024-08-15. Created: 2024-08-15. Last updated: 2024-08-15. Bibliographically approved.
Ontkovičová, Z. & Torra, V. (2024). Computation of Choquet integrals: Analytical approach for continuous functions. Information Sciences, 679, Article ID 121105.
Computation of Choquet integrals: Analytical approach for continuous functions
2024 (English). In: Information Sciences, ISSN 0020-0255, E-ISSN 1872-6291, Vol. 679, article id 121105. Article in journal (Refereed). Published.
Abstract [en]

In the continuous case, analytical computations of the Choquet integral are limited, despite it being commonly used in various applications. One can either use the definition, which is computationally demanding and impractical, or apply existing formulas restricted to monotone nonnegative functions on a real interval starting at zero. This article aims to present more convenient computational formulas for continuous functions on any real interval, without imposing restrictions on their monotonicity. First, a more general approach to monotone functions is provided, for both positive and negative functions. Then, reordering techniques are introduced to compute the Choquet integral of an arbitrary continuous function; with these, a monotone equivalent of every function can be constructed. This equivalent function preserves the final Choquet integral value, implying that only formulas for monotone functions are required. In addition to general fuzzy measures, the article considers the particular cases of distorted Lebesgue measures and distorted probabilities as the most commonly used fuzzy measures.
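For reference, the definitional route the abstract calls computationally demanding is the level-set (horizontal) integral below; the second identity, for nondecreasing functions under a distorted Lebesgue measure, is the kind of existing formula the abstract mentions, stated under the assumption that the distortion m is differentiable.

```latex
% Continuous Choquet integral of f >= 0 w.r.t. a fuzzy measure \mu,
% via level sets (the definitional route):
\[
  (C)\!\int f \, d\mu \;=\; \int_0^{\infty} \mu\bigl(\{\, s : f(s) \ge r \,\}\bigr)\, dr .
\]
% For a distorted Lebesgue measure \mu_m(A) = m(\lambda(A)) with m
% differentiable and f nondecreasing on [0, t], this reduces to an
% ordinary integral:
\[
  (C)\!\int_{[0,t]} f \, d\mu_m \;=\; \int_0^{t} m'(t - s)\, f(s)\, ds .
\]
```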

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Choquet integrals, Computational formulas, Distorted Lebesgue measures, Distorted probabilities, Reordering techniques
National Category
Mathematical Analysis; Other Mathematics
Research subject
Mathematics
Identifiers
urn:nbn:se:umu:diva-227701 (URN); 10.1016/j.ins.2024.121105 (DOI); 2-s2.0-85197098440 (Scopus ID)
Funder
Umeå University; The Kempe Foundations; Knut and Alice Wallenberg Foundation; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-07-04. Created: 2024-07-04. Last updated: 2024-07-09. Bibliographically approved.
Projects
Disclosure risk and transparency in big data privacy [2016-03346_VR]; University of Skövde
Identifiers
ORCID iD: orcid.org/0000-0002-0368-8037
