PrunePrivyTune: accelerating language models with pruning and differentially private fine-tuning
Umeå University, Faculty of Science and Technology, Department of Computing Science. South Asian University, Delhi, India. ORCID iD: 0000-0002-7204-8228
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-0368-8037
2026 (English). In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 115, no 5, article id 108. Article in journal (Refereed). Published
Abstract [en]

Large Language Models (LLMs) have demonstrated exceptional capabilities in language understanding and generation, but their large-scale architecture poses significant challenges in deployment and inference, such as increased computational demands and slower processing times. While techniques such as model pruning, knowledge distillation, and quantization have been developed to compress LLMs, they often result in task-specific compression, limiting the model's versatility. Additionally, LLMs face privacy risks because they can memorize and reproduce sensitive training data, raising concerns when deployed in real-world applications. To address these challenges, we propose a novel methodology, PrunePrivyTune, that combines efficient model compression with privacy-preserving fine-tuning. Our approach leverages pairwise cosine similarity to identify redundant layers in transformer models, enabling structural pruning that reduces model size without compromising performance. After pruning, we apply Low-Rank Adaptation (LoRA) with differentially private stochastic gradient descent (DP-SGD) to fine-tune the model. This ensures that the fine-tuning process is both efficient and privacy-preserving, outperforming training and preventing the model from memorizing sensitive data. We then generate synthetic data with the fine-tuned model and conduct a training data extraction attack to assess the model's privacy vulnerabilities in terms of perplexity and BERTScore. Our results show that the proposed methodology effectively reduces inference time through model compression, and that pruning complements privacy when followed by private fine-tuning. Additionally, our privacy risk assessment indicates that integrating differential privacy (DP) successfully mitigates the risk of memorization. This approach upholds strong privacy guarantees, making it highly suitable for real-time applications and deployment in sensitive domains where data confidentiality is paramount.
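
The layer-redundancy idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each transformer layer has already been summarized as a single vector (for example, a mean hidden state), and the greedy selection rule, the function names, and the 0.99 threshold are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def pairwise_similarity(layer_vectors):
    """Full matrix of cosine similarities between layer summaries."""
    n = len(layer_vectors)
    return [
        [cosine_similarity(layer_vectors[i], layer_vectors[j]) for j in range(n)]
        for i in range(n)
    ]


def redundant_layers(layer_vectors, threshold=0.99):
    """Greedy rule (an assumption): a layer is pruned when it is a
    near-duplicate of a layer that has already been kept."""
    sim = pairwise_similarity(layer_vectors)
    kept, pruned = [], []
    for i in range(len(layer_vectors)):
        if any(sim[i][k] >= threshold for k in kept):
            pruned.append(i)
        else:
            kept.append(i)
    return kept, pruned


# Toy layer summaries: layer 2 points in the same direction as layer 0.
layers = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]
kept, pruned = redundant_layers(layers)
```

In this toy input only layer 2 is flagged, because cosine similarity ignores magnitude and compares direction alone. How the paper summarizes layers and which layers it removes may differ.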

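The private fine-tuning step named in the abstract relies on the standard DP-SGD mechanism: clip each per-example gradient, sum, add Gaussian noise, then average. The sketch below shows that mechanism on a plain parameter vector. It is a generic illustration, not the paper's code: in the described method the update would apply to LoRA adapter parameters, and the function names, arguments, and toy values here are hypothetical.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update on a flat parameter list."""
    batch_size = len(per_example_grads)
    # 1. Bound each example's influence by clipping its gradient.
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    # 2. Sum the clipped gradients coordinate-wise.
    summed = [sum(column) for column in zip(*clipped)]
    # 3. Add Gaussian noise with standard deviation noise_multiplier * max_norm.
    noisy = [s + rng.gauss(0.0, noise_multiplier * max_norm) for s in summed]
    # 4. Average over the batch and take a gradient step.
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


rng = random.Random(0)  # seeded only to make the example repeatable
params = [0.0, 0.0]
grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
new_params = dp_sgd_step(params, grads, lr=0.1, max_norm=1.0,
                         noise_multiplier=1.1, rng=rng)
```

The privacy guarantee depends on the noise multiplier, the sampling rate, and the number of steps, which a privacy accountant tracks; that accounting is outside this sketch.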
Place, publisher, year, edition, pages
Springer Nature, 2026. Vol. 115, no 5, article id 108
Keywords [en]
Differential privacy, Low-rank adaptation, Model compression, Pruning, Training data extraction attack
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-253740
DOI: 10.1007/s10994-026-07016-y
ISI: 001753498000001
Scopus ID: 2-s2.0-105037992367
OAI: oai:DiVA.org:umu-253740
DiVA, id: diva2:2064261
Funder
Umeå University
Available from: 2026-06-01 Created: 2026-06-01 Last updated: 2026-06-01 Bibliographically approved

Open Access in DiVA

fulltext (1761 kB), 33 downloads
File information
File name: FULLTEXT01.pdf
File size: 1761 kB
Checksum (SHA-512): 2960d45389284cf52f95d66c6811c75d14352d342bc7ca0d3c6416bdd55b95e7b6034d5ed9873465db347d02890cf55c4e5bea714eaffa629fd91b9716a88d78
Type: fulltext
Mimetype: application/pdf

Authority records

Garg, Sonakshi; Torra, Vicenç