Publications (10 of 95)
Paul, S., Sadjadi, F., Torra, V. & Salas, J. (2025). Community discovery on dynamic graphs with edge local differential privacy. Complex Systems
Community discovery on dynamic graphs with edge local differential privacy
2025 (English) In: Complex Systems, ISSN 0891-2513. Article in journal (Refereed). In press
Abstract [en]

Interactions among the elements of complex networks are organized in a structured manner: the collective behavior of these elements follows a community structure. Several methods have been defined to detect such substructures automatically, in the field known as community discovery. Most of these methods have been applied to static or aggregated data; recently, the identification of evolving communities has gained more attention. Studying the relations among individuals yields insights into how communities form and evolve, but limits should be enforced to respect individuals’ privacy when their data are collected and shared. Privacy-protection techniques have commonly been applied to static data, while few methods work on dynamic data. Recently, some approaches protect dynamic graphs with edge local differential privacy and have been tested for community discovery applications. However, the evolution of the communities over time has not been evaluated on the privacy-protected data. We evaluate the utility of such local edge-ε-differential privacy methods with respect to community discovery and community evolution in time-varying networks. We show empirically that these algorithms can provide privacy while preserving community lifecycles, enabling their privacy-aware study.
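
For readers unfamiliar with the edge local differential privacy model referenced above: the setting is typically instantiated with randomized response applied independently to each user's adjacency bits. The sketch below is illustrative only (function name and parameters are ours, not the specific mechanism evaluated in the paper):

    import numpy as np

    def randomized_response_bits(bits, epsilon, rng=None):
        # Warner's randomized response on one user's adjacency row: each bit is
        # reported truthfully with probability e^eps / (1 + e^eps), which gives
        # epsilon-edge-local differential privacy for that user's edge list.
        rng = np.random.default_rng() if rng is None else rng
        p_true = np.exp(epsilon) / (1.0 + np.exp(epsilon))
        bits = np.asarray(bits, dtype=int)
        keep = rng.random(bits.shape) < p_true
        return np.where(keep, bits, 1 - bits)

    # Example: one user's (private) adjacency row in a 6-node graph.
    noisy_row = randomized_response_bits([0, 1, 0, 0, 1, 0], epsilon=1.0)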

Place, publisher, year, edition, pages
Complex Systems Publications, 2025
Keywords
edge local differential privacy, dynamic graphs, community discovery
National Category
Security, Privacy and Cryptography
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-239142 (URN), 10.25088/ComplexSystems.34.2.000 (DOI)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP), 570011356
Available from: 2025-05-23 Created: 2025-05-23 Last updated: 2025-05-26
Varshney, A. K. & Torra, V. (2025). Concept drift detection using ensemble of integrally private models. In: Rosa Meo; Fabrizio Silvestri (Ed.), Machine Learning and Principles and Practice of Knowledge Discovery in Databases: International Workshops of ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Revised Selected Papers, Part V. Paper presented at MLCS@ECML-PKDD 2023, The 5th Workshop on Machine Learning for CyberSecurity, Turin, Italy, September 18-22, 2023 (pp. 290-304). Springer
Concept drift detection using ensemble of integrally private models
2025 (English) In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases: International Workshops of ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Revised Selected Papers, Part V / [ed] Rosa Meo; Fabrizio Silvestri, Springer, 2025, p. 290-304. Conference paper, Published paper (Refereed)
Abstract [en]

Deep neural networks (DNNs) are among the most widely used machine learning algorithms. DNNs require the training data, with true labels, to be available beforehand. This is not feasible for many real-world problems where data arrive as a stream and true labels are scarce and expensive to acquire. In the literature, little attention has been given to the privacy aspects of streaming data, whose distribution may change frequently. These concept drifts must be detected privately in order to avoid any disclosure risk from the DNNs. Existing privacy models use concept drift detection schemes such as ADWIN and KSWIN to detect the drifts. In this paper, we focus on the notion of integrally private DNNs to detect concept drifts. Integrally private DNNs are models that recur frequently when trained on different datasets. Based on this, we introduce an ensemble methodology, which we call Integrally Private Drift Detection (IPDD), to detect concept drift from private models. Our IPDD method does not require labels to detect drift but assumes true labels are available once a drift has been detected. We have experimented with binary and multi-class synthetic and real-world data. Our experimental results show that our methodology can privately detect concept drift, achieves utility comparable with (and in some cases better than) ADWIN, and outperforms differentially private models at different privacy levels.
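
KSWIN, mentioned in the abstract, detects drift by comparing a recent window of the stream against a reference window with a two-sample Kolmogorov–Smirnov test. The following is a minimal, generic sketch of that idea (window size and threshold are illustrative assumptions; it is not the paper's IPDD method):

    from collections import deque
    from scipy.stats import ks_2samp

    def detect_drift(stream, window=100, alpha=0.01):
        # Report positions where a recent window of values no longer looks like
        # the reference window, according to a two-sample KS test.
        reference, recent, drifts = deque(maxlen=window), deque(maxlen=window), []
        for i, x in enumerate(stream):
            (reference if len(reference) < window else recent).append(x)
            if len(recent) == window:
                _, p_value = ks_2samp(list(reference), list(recent))
                if p_value < alpha:
                    drifts.append(i)
                    # After a drift, the recent window becomes the new reference.
                    reference = deque(recent, maxlen=window)
                    recent = deque(maxlen=window)
        return drifts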

Place, publisher, year, edition, pages
Springer, 2025
Series
Communications in Computer and Information Science, ISSN 1865-0929, E-ISSN 1865-0937 ; 2137
Keywords
Data privacy, Integral privacy, Concept Drift, Private drift, Deep neural networks, Streaming data.
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-222796 (URN), 10.1007/978-3-031-74643-7_22 (DOI), 2-s2.0-85215978495 (Scopus ID), 978-3-031-74643-7 (ISBN), 978-3-031-74642-0 (ISBN)
Conference
MLCS@ECML-PKDD 2023, The 5th Workshop on Machine Learning for CyberSecurity, Turin, Italy, September 18-22, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-03-27 Created: 2024-03-27 Last updated: 2025-04-28. Bibliographically approved
Torra, V. (2025). Differentially private Choquet integral: extending mean, median, and order statistics. International Journal of Information Security, 24(1), Article ID 68.
Differentially private Choquet integral: extending mean, median, and order statistics
2025 (English) In: International Journal of Information Security, ISSN 1615-5262, E-ISSN 1615-5270, Vol. 24, no 1, article id 68. Article in journal (Refereed). Published
Abstract [en]

The Choquet integral is a well-known aggregation function that generalizes several other well-known functions. For example, appropriate parameterizations reduce a Choquet integral to the arithmetic mean, the weighted mean, order statistics, and linear combinations of order statistics. This integral has been used extensively in data fusion, with applications in computer science, economics, and decision making. Formally, Choquet integrals integrate a function (the data to be aggregated) with respect to a non-additive measure, also called a fuzzy measure (which represents the background knowledge on the information sources that provide the data to be aggregated). In this paper we propose a privacy-preserving Choquet integral that satisfies differential privacy. We then study the sensitivity of the Choquet integral with respect to different types of fuzzy measures. Our results generalize previous knowledge about the sensitivity of the minimum, the maximum, and the arithmetic mean.
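
For reference, the discrete Choquet integral of x = (x_1, ..., x_n) with respect to a fuzzy measure \mu is the standard (textbook) expression, not a construction specific to this paper:

    C_\mu(x) = \sum_{i=1}^{n} \bigl( x_{\sigma(i)} - x_{\sigma(i-1)} \bigr) \, \mu\bigl( A_{\sigma(i)} \bigr),

where \sigma is a permutation with x_{\sigma(1)} \le \dots \le x_{\sigma(n)}, x_{\sigma(0)} := 0, and A_{\sigma(i)} := \{\sigma(i), \dots, \sigma(n)\}. Taking \mu(A) = |A|/n recovers the arithmetic mean, additive \mu gives weighted means, and symmetric \mu (depending only on |A|) gives order statistics and their linear combinations. A standard route to \varepsilon-differential privacy for such an aggregate is the Laplace mechanism, releasing C_\mu(x) + \mathrm{Lap}(\Delta/\varepsilon) with \Delta the sensitivity of C_\mu; analysing that sensitivity for different classes of fuzzy measures is the paper's contribution, so the mechanism sketched here is only the generic construction.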

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
Choquet integral, Differential privacy, Information aggregation, Mean and median
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-236125 (URN), 10.1007/s10207-025-00984-7 (DOI), 001404827800001 (), 2-s2.0-85218416161 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2023-05541
Available from: 2025-03-07 Created: 2025-03-07 Last updated: 2025-03-07. Bibliographically approved
Varshney, A. K. & Torra, V. (2025). Efficient federated unlearning under plausible deniability. Machine Learning, 114(1), Article ID 25.
Efficient federated unlearning under plausible deniability
2025 (English) In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 114, no 1, article id 25. Article in journal (Refereed). Published
Abstract [en]

Privacy regulations like the GDPR in Europe and the CCPA in the US grant users the right to remove their data from machine learning (ML) applications. Machine unlearning addresses this by modifying the ML parameters in order to forget the influence of a specific data point on the model’s weights. Recent literature has highlighted that the contribution from a data point can be forged with some other data points in the dataset with probability close to one. This allows a server to falsely claim unlearning without actually modifying the model’s parameters. However, in distributed paradigms such as federated learning (FL), where the server lacks access to the dataset and the number of clients is limited, claiming unlearning in this way becomes a challenge: an honest server must modify the model parameters in order to unlearn. This paper introduces an efficient way to achieve machine unlearning in FL, i.e., federated unlearning, by employing a privacy model which allows the FL server to plausibly deny a client’s participation in the training up to a certain extent. Specifically, we demonstrate that the server can generate a Proof-of-Deniability, where each aggregated update can be associated with at least x (the plausible deniability parameter) client updates. This enables the server to plausibly deny a client’s participation. However, in the event of frequent unlearning requests, the server is required to adopt an unlearning strategy and, accordingly, update its model parameters. We also perturb the client updates within a cluster in order to avoid inference by an honest-but-curious server. We show that the global model satisfies (ε, δ)-differential privacy after T communication rounds. The proposed methodology has been evaluated on multiple datasets in different privacy settings. The experimental results show that our framework achieves comparable utility while providing a significant reduction in memory (≈ 30 times) as well as in retraining time (1.6-500769 times). The source code for the paper is available at https://github.com/Ayush-Umu/Federated-Unlearning-under-Plausible-Deniability
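
For reference, the (ε, δ)-differential privacy guarantee claimed above is the standard one: a randomized mechanism M satisfies (ε, δ)-differential privacy if, for all neighbouring inputs D and D' and all measurable sets of outputs S,

    \Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] + \delta.

In this federated setting, neighbouring inputs differ in one client's contribution; how the guarantee accumulates over the T communication rounds is part of the paper's own analysis and is not reproduced here.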

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
Machine unlearning, Federated unlearning, FedAvg, Integral privacy, Plausible deniability, Differential privacy
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-234248 (URN), 10.1007/s10994-024-06685-x (DOI), 001400054000004 (), 2-s2.0-85217772811 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-01-18 Created: 2025-01-18 Last updated: 2025-02-25. Bibliographically approved
Garg, S. & Torra, V. (2025). Exploring distribution learning of synthetic data generators. In: Joaquin Garcia-Alfaro; Ken Barker; Guillermo Navarro-Arribas; Cristina Pérez-Solà; Sergi Delgado-Segura; Sokratis Katsikas; Frédéric Cuppens; Costas Lambrinoudakis; Nora Cuppens-Boulahia; Marek Pawlicki; Michał Choraś (Ed.), Computer Security. ESORICS 2024 International Workshops. Paper presented at ESORICS2024: 29th European Symposium on Research in Computer Security, Bydgoszcz, Poland, September 16–20, 2024 (pp. 65-76). Springer Nature, 15263
Exploring distribution learning of synthetic data generators
2025 (English) In: Computer Security. ESORICS 2024 International Workshops / [ed] Joaquin Garcia-Alfaro; Ken Barker; Guillermo Navarro-Arribas; Cristina Pérez-Solà; Sergi Delgado-Segura; Sokratis Katsikas; Frédéric Cuppens; Costas Lambrinoudakis; Nora Cuppens-Boulahia; Marek Pawlicki; Michał Choraś, Springer Nature, 2025, Vol. 15263, p. 65-76. Chapter in book (Refereed)
Abstract [en]

In the era of data protection regulations like GDPR, safeguarding sensitive information has become paramount, prompting the exploration of synthetic data generation as a privacy-preserving alternative. Generative Adversarial Networks (GAN) and Variational Autoencoders (VAE), among other tools, have become popular for synthetic data generation. Despite their effectiveness, these models often carry the perception of being black boxes due to their complex learning mechanisms. Understanding the intricate behavior of data within a GAN or VAE poses a significant challenge, particularly with high-dimensional datasets. This is essential from a privacy perspective, as synthetic data can be used instead of the original data and can be considered an alternative to anonymization. Our study aims to assess the distribution learning capabilities of synthetic data generators. Our methodology centers on artificially created datasets, such as swiss roll and S-curve distributions, which offer easy visualization in R space. Additionally, we evaluate point datasets containing discontinuous points to determine whether GAN and VAE comprehend the discontinuity behavior of datasets. By evaluating the data processed by GAN and VAE, we aim to reveal their learning capabilities and disentangle the complexities of synthetic data generation. Our research shifts the focus from real-world image datasets to artificially generated datasets, enabling exploration of commonly encountered distributions in low-dimensional spaces. Despite the widespread recognition of GAN in image synthesis, achieving satisfactory results often requires numerous tricks due to training instability. We found that VAE exhibit a superior understanding of the underlying distribution of points in R space compared to GAN. This inclination towards VAE arises from their more stable training process, inherent ability to capture latent structures within the data, and faster convergence compared to GAN.
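
The swiss-roll and S-curve distributions mentioned above are standard toy manifolds and can be generated, for example, with scikit-learn. The snippet below is only an illustrative way to reproduce such evaluation data (sample sizes and noise levels are assumptions, not the paper's settings):

    from sklearn.datasets import make_swiss_roll, make_s_curve

    # 3-D point clouds lying on 2-D manifolds: easy to visualise and to compare
    # against samples drawn from a trained GAN or VAE.
    X_roll, t_roll = make_swiss_roll(n_samples=5000, noise=0.05, random_state=0)
    X_curve, t_curve = make_s_curve(n_samples=5000, noise=0.05, random_state=0)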

Place, publisher, year, edition, pages
Springer Nature, 2025
Series
Lecture Notes in Computer Science ; 15263
Keywords
Manifold Learning, Privacy, Synthetic Data Generators, Generative Adversarial Networks, Variational Autoencoder
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-238534 (URN), 10.1007/978-3-031-82349-7_5 (DOI), 2-s2.0-105002461320 (Scopus ID)
Conference
ESORICS2024: 29th European Symposium on Research in Computer Security, Bydgoszcz, Poland, September 16–20, 2024
Note
Included in the conference series: European Symposium on Research in Computer Security
Available from: 2025-05-07 Created: 2025-05-07 Last updated: 2025-05-08. Bibliographically approved
Taha, M. & Torra, V. (2025). Generalized F-spaces through the lens of fuzzy measures. Fuzzy sets and systems (Print), 507, Article ID 109317.
Generalized F-spaces through the lens of fuzzy measures
2025 (English) In: Fuzzy sets and systems (Print), ISSN 0165-0114, E-ISSN 1872-6801, Vol. 507, article id 109317. Article in journal (Refereed). Published
Abstract [en]

Probabilistic metric spaces are natural extensions of metric spaces, where the function that computes the distance outputs a distribution on the real numbers rather than a single value. Such a function is called a distribution function. F-spaces are constructions for probabilistic metric spaces, where the distribution functions are built for functions that map from a measurable space to a metric space. In this paper, we propose an extension of F-spaces, called Generalized F-space. This construction replaces the metric space with a probabilistic metric space and uses fuzzy measures to evaluate sets of elements whose distances are probability distributions. We present several results that establish connections between the properties of the constructed space and specific fuzzy measures under particular triangular norms. Furthermore, we demonstrate how the space can be applied in machine learning to compute distances between different classifier models. Experimental results based on Sugeno λ-measures are consistent with our theoretical findings.
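
The Sugeno λ-measures used in the experiments above are the standard family of fuzzy measures defined, for disjoint sets A and B, by

    g_\lambda(A \cup B) = g_\lambda(A) + g_\lambda(B) + \lambda \, g_\lambda(A) \, g_\lambda(B), \qquad \lambda > -1.

On a finite set X = {x_1, ..., x_n} with densities g_i = g_\lambda({x_i}), the normalisation g_\lambda(X) = 1 determines λ as the unique non-zero root in (−1, ∞) of

    1 + \lambda = \prod_{i=1}^{n} (1 + \lambda g_i)

whenever the densities do not sum to 1 (λ = 0, the additive case, when they do). This is the textbook definition; the Generalized F-space construction of the paper is built on top of such measures.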

Keywords
Fuzzy measures, Probabilistic metric space
National Category
Computer Sciences; Computer Systems
Identifiers
urn:nbn:se:umu:diva-235860 (URN), 10.1016/j.fss.2025.109317 (DOI), 001428707700001 (), 2-s2.0-85217744245 (Scopus ID)
Available from: 2025-02-24 Created: 2025-02-24 Last updated: 2025-04-30. Bibliographically approved
Paul, S., Salas, J. & Torra, V. (2025). Improving locally differentially private graph statistics through sparseness-preserving noise-graph addition. In: Roberto Di Pietro; Karen Renaud; Paolo Mori (Ed.), Proceedings of the 11th International Conference on Information Systems Security and Privacy: Volume 2. Paper presented at 11th International Conference on Information Systems Security and Privacy, Porto, Portugal, February 20-22, 2025 (pp. 526-533). SciTePress, 2
Improving locally differentially private graph statistics through sparseness-preserving noise-graph addition
2025 (English) In: Proceedings of the 11th International Conference on Information Systems Security and Privacy: Volume 2 / [ed] Roberto Di Pietro; Karen Renaud; Paolo Mori, SciTePress, 2025, Vol. 2, p. 526-533. Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

Differential privacy allows publishing graph statistics in a way that protects individual privacy while still allowing meaningful insights to be derived from the data. The centralized privacy model of differential privacy assumes that there is a trusted data curator, while the local model does not require such a trusted authority. Local differential privacy is commonly achieved through randomized response (RR) mechanisms, which do not preserve the sparseness of the graphs. As most real-world graphs are sparse and have many nodes, this is a drawback of RR-based mechanisms in terms of computational efficiency and accuracy. We thus propose a comparative analysis, through experiments and discussion, of computing statistics with local differential privacy, and show that preserving the sparseness of the original graphs is the key factor in balancing utility and privacy. We perform several experiments to test the utility of the protected graphs in terms of sub-graph counting (i.e., triangle and star counting) and other statistics. We show that the sparseness-preserving algorithm gives comparable or better results than other state-of-the-art methods and improves computational efficiency.
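
The sparseness issue mentioned above can be seen with a quick back-of-the-envelope count: under randomized response each absent edge is reported as present with probability q = 1/(1 + e^ε), so a graph with n nodes and m ≪ n² true edges is published with roughly

    q \cdot \bigl( n(n-1)/2 - m \bigr)

spurious edges. For example (illustrative numbers, not the paper's datasets), with n = 10⁵, m = 10⁶ and ε = 1 (q ≈ 0.27), that is on the order of 10⁹ added edges, which dwarfs the true edge set; avoiding this blow-up is precisely what a sparseness-preserving noise-graph addition targets.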

Place, publisher, year, edition, pages
SciTePress, 2025
Series
ICISSP, ISSN 2184-4356
Keywords
Privacy in Large Network, Differential Privacy, Edge Local Differential Privacy
National Category
Computer and Information Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-237718 (URN), 10.5220/0013174400003899 (DOI), 2-s2.0-105001734608 (Scopus ID), 978-989-758-735-1 (ISBN)
Conference
11th International Conference on Information Systems Security and Privacy, Porto, Portugal, February 20-22, 2025
Projects
570011356
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP), 570011356
Available from: 2025-04-15 Created: 2025-04-15 Last updated: 2025-04-16. Bibliographically approved
Ontkovičová, Z. & Torra, V. (2025). On measures resulting from the Choquet integration. In: Marie-Jeanne Lesot; Susana Vieira; Marek Z. Reformat; João Paulo Carvalho; Fernando Batista; Bernadette Bouchon-Meunier; Ronald R. Yager (Ed.), Information processing and management of uncertainty in knowledge-based systems: 20th international conference, IPMU 2024, Lisbon, Portugal, July 22-26, 2024, proceedings, volume 2. Paper presented at 20th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems. Lisbon, Portugal 2024 (pp. 3-11). Cham: Springer
On measures resulting from the Choquet integration
2025 (English) In: Information processing and management of uncertainty in knowledge-based systems: 20th international conference, IPMU 2024, Lisbon, Portugal, July 22-26, 2024, proceedings, volume 2 / [ed] Marie-Jeanne Lesot; Susana Vieira; Marek Z. Reformat; João Paulo Carvalho; Fernando Batista; Bernadette Bouchon-Meunier; Ronald R. Yager, Cham: Springer, 2025, p. 3-11. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we study measures arising as a result of Choquet integration with respect to a particular class of measures. The initial insight is provided for several classes of additive and fuzzy measures. Some classes turn out to be closed under the integration, e.g., probabilities, while others are not, such as distorted Lebesgue measures. Knowing both an integration measure and the resulting measure after Choquet integration leads directly to a fuzzy analogue of the Radon-Nikodym derivatives. For a specific pair of measures, a completely different possible approach to the existence of such derivatives is presented.

Place, publisher, year, edition, pages
Cham: Springer, 2025
Series
Lecture Notes in Networks and Systems, ISSN 2367-3370, E-ISSN 2367-3389 ; 1175
National Category
Mathematical Analysis
Identifiers
urn:nbn:se:umu:diva-238764 (URN), 10.1007/978-3-031-74000-8_1 (DOI), 2-s2.0-105004652794 (Scopus ID), 978-3-031-73999-6 (ISBN), 978-3-031-74000-8 (ISBN)
Conference
20th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems. Lisbon, Portugal 2024
Available from: 2025-05-13 Created: 2025-05-13 Last updated: 2025-05-19. Bibliographically approved
Nicolau, A. T., Parra-Arnau, J., Forne, J. & Torra, V. (2025). Uncoordinated syntactic privacy: a new composable metric for multiple, independent data publishing. IEEE Transactions on Information Forensics and Security, 20, 3362-3373
Uncoordinated syntactic privacy: a new composable metric for multiple, independent data publishing
2025 (English) In: IEEE Transactions on Information Forensics and Security, ISSN 1556-6013, E-ISSN 1556-6021, Vol. 20, p. 3362-3373. Article in journal (Refereed). Published
Abstract [en]

A privacy model is a privacy condition, dependent on a parameter, that guarantees an upper bound on the risk of reidentification disclosure and possibly also on the risk of attribute disclosure by an adversary. A privacy model is composable if the privacy guarantees of the model are preserved, possibly to a limited extent, after repeated independent application of the privacy model. From the opposite perspective, a privacy model is not composable if multiple independent data releases, each of them satisfying the requirements of the privacy model, may result in a privacy breach. Current privacy models are broadly classified into syntactic ones (such as k-anonymity and l-diversity) and semantic ones, which essentially refer to ε-differential privacy (ε-DP) and variations thereof. While ε-DP and its variants offer strong composability properties, syntactic notions are not composable unless data releases are conducted by a single, centralized data holder that uses specialized notions such as m-invariance and τ-safety. In this work, we propose m-uncoordinated-syntactic-privacy (m-USP), the first syntactic notion with composability properties for the independent publication of nondisjoint data, in other words, without a centralized data holder. Theoretical results are formally proven, and experimental results demonstrate that the risk to individuals does not increase significantly, in contrast to non-composable methods, which are susceptible to attribute disclosure. In most cases, the utility degradation caused by the extra protection is less than 5% and decreases as the value of m increases.

Place, publisher, year, edition, pages
IEEE, 2025
Keywords
composability property, Data privacy, privacy model, syntactic privacy
National Category
Computer Sciences; Computer Systems
Identifiers
urn:nbn:se:umu:diva-237587 (URN), 10.1109/TIFS.2025.3551645 (DOI), 001455443300004 (), 2-s2.0-105001938533 (Scopus ID)
Available from: 2025-04-24 Created: 2025-04-24 Last updated: 2025-04-24. Bibliographically approved
Varshney, A. K., Vandikas, K. & Torra, V. (2025). Unlearning clients, features and samples in vertical federated learning. Paper presented at The 25th Privacy Enhancing Technologies Symposium, Washington, USA and online, July 14-19, 2025. Proceedings on Privacy Enhancing Technologies, 2025(2), 39-53
Unlearning clients, features and samples in vertical federated learning
2025 (English) In: Proceedings on Privacy Enhancing Technologies, E-ISSN 2299-0984, Vol. 2025, no 2, p. 39-53. Article in journal (Refereed). Published
Abstract [en]

Federated Learning (FL) has emerged as a prominent distributed learning paradigm that allows multiple users to collaboratively train a model without sharing their data, thus preserving privacy. Within the scope of privacy preservation, information privacy regulations such as GDPR entitle users to request the removal (or unlearning) of their contribution from a service that is hosting the model. For this purpose, a server hosting an ML model must be able to unlearn certain information in cases such as copyright infringement or security issues that can make the model vulnerable or impact the performance of a service based on that model. While most unlearning approaches in FL focus on Horizontal Federated Learning (HFL), where clients share the feature space and the global model, Vertical Federated Learning (VFL) has received less attention from the research community. VFL involves clients (passive parties) sharing the sample space among them while not having access to the labels. In this paper, we explore unlearning in VFL from three perspectives: unlearning passive parties, unlearning features, and unlearning samples. To unlearn passive parties and features we introduce VFU-KD, which is based on knowledge distillation (KD), while to unlearn samples we introduce VFU-GA, which is based on gradient ascent (GA). To provide evidence of approximate unlearning, we utilize a Membership Inference Attack (MIA) to audit the effectiveness of our unlearning approach. Our experiments across six tabular datasets and two image datasets demonstrate that VFU-KD and VFU-GA achieve performance comparable to or better than both retraining from scratch and the benchmark R2S method in many cases, with improvements of 0-2%. In the remaining cases, utility scores remain comparable, with a modest utility loss ranging from 1-5%. Unlike existing methods, VFU-KD and VFU-GA require no communication between active and passive parties during unlearning. However, they do require the active party to store the previously communicated embeddings.

Place, publisher, year, edition, pages
Privacy Enhancing Technologies Symposium Advisory Board, 2025
Keywords
Federated learning; Unlearning; Vertical federated learning; Auditing; Membership inference attack
National Category
Security, Privacy and Cryptography
Identifiers
urn:nbn:se:umu:diva-237429 (URN), 10.56553/popets-2025-0048 (DOI)
Conference
The 25th Privacy Enhancing Technologies Symposium, Washington, USA and online, July 14-19, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-04-09 Created: 2025-04-09 Last updated: 2025-04-28Bibliographically approved
Projects
Disclosure risk and transparency in big data privacy [2016-03346_VR]; University of Skövde
Identifiers
ORCID iD: orcid.org/0000-0002-0368-8037
