Personality-based Knowledge Extraction for Privacy-preserving Data Analysis
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Database and Data Mining Group) ORCID iD: 0000-0001-8820-2405
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Database and Data Mining Group)
Umeå University, Faculty of Social Sciences, Centre for Demographic and Ageing Research (CEDAR).
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2017 (English). In: K-CAP 2017 - Proceedings of the Knowledge Capture Conference, Austin, TX, USA: ACM Digital Library, 2017, article id 45. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we present a differentially private approach that extracts personality-based knowledge to support privacy-guaranteed analysis of sensitive personal data. Based on this approach, we implement an end-to-end privacy-guarantee system, KaPPA, that lets researchers analyze sensitive data iteratively. The key challenge in differential privacy is choosing a privacy budget that balances privacy protection against data utility. Most previous work applies a uniform privacy budget to all individuals' data, which under-protects some individuals while over-protecting others. In KaPPA, the proposed personality-based approach automatically calculates a privacy budget for each individual. Our experimental evaluation shows that the approach achieves a favorable trade-off between sufficient privacy protection and data utility.
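
To illustrate the role of the privacy budget described in the abstract, the following is a minimal sketch of the standard Laplace mechanism applied with a separate budget per individual. It is not KaPPA's method: the record does not say how KaPPA derives budgets from personality, so the function names (`laplace_noise`, `noise_scale`, `perturb`), the value range [0, 1], and the example budgets are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noise_scale(epsilon, lower, upper):
    """Laplace scale for a value bounded in [lower, upper] under budget epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = upper - lower  # largest change a single value can cause
    return sensitivity / epsilon


def perturb(value, epsilon, lower, upper, rng):
    """Clip the value to [lower, upper], then add noise calibrated to epsilon."""
    clipped = min(max(value, lower), upper)
    return clipped + laplace_noise(noise_scale(epsilon, lower, upper), rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical records: (sensitive value, per-individual epsilon).
    # A smaller epsilon means stronger protection and therefore more noise.
    records = [(0.2, 0.1), (0.9, 1.0), (0.5, 5.0)]
    for value, eps in records:
        scale = noise_scale(eps, 0.0, 1.0)
        print(f"epsilon={eps}: scale={scale:.2f}, "
              f"released={perturb(value, eps, 0.0, 1.0, rng):.3f}")
```

With a single uniform budget every record would receive the same noise scale; giving each individual a separate epsilon lets the noise follow that individual's privacy need, which is the trade-off the abstract refers to.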

Place, publisher, year, edition, pages
Austin, TX, USA: ACM Digital Library, 2017. article id 45
Keywords [en]
Differential Privacy, Privacy-preserving Data Analysis
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-143228; DOI: 10.1145/3148011.3154479; ISBN: 978-1-4503-5553-7 (electronic); OAI: oai:DiVA.org:umu-143228; DiVA, id: diva2:1167798
Conference
K-CAP 2017: The 9th International Conference on Knowledge Capture, Austin, Texas, December 4-6, 2017
Projects
Privacy-aware data federation
Available from: 2017-12-19 Created: 2017-12-19 Last updated: 2019-08-22 Bibliographically approved
In thesis
1. Privacy-awareness in the era of Big Data and machine learning
2019 (English). Licentiate thesis, comprehensive summary (Other academic)
Alternative title [sv]
Integritetsmedvetenhet i eran av Big Data och maskininlärning
Abstract [en]

Social Network Sites (SNS), such as Facebook and Twitter, play a large role in our lives. On the one hand, they help connect people who would not otherwise be connected. Many recent breakthroughs in AI, such as facial recognition [49], were achieved thanks to the amount of data available on the Internet via SNS (hereafter Big Data). On the other hand, many people have tried to avoid SNS because of privacy concerns. Much as the Internet protocol was not designed with security in mind, Machine Learning (ML), the core of AI, was not designed with privacy in mind. For instance, Support Vector Machines (SVMs) solve a quadratic optimization problem by deciding which instances of the training dataset are support vectors. This means that the data of people involved in the training process is also published as part of the SVM model. Thus, privacy guarantees must hold even for the worst-case outliers, while data utility must still be preserved.
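
The point about SVMs can be seen in the standard kernel SVM decision function, shown here as a textbook formulation rather than anything taken from the thesis. The symbols ($\alpha_i$, $y_i$, $K$, $b$, and the support-vector index set $\mathrm{SV}$) follow common convention and are not defined in this record.

```latex
\[
  f(x) = \operatorname{sign}\!\Big( \sum_{i \in \mathrm{SV}} \alpha_i \, y_i \, K(x_i, x) + b \Big),
  \qquad \mathrm{SV} = \{\, i : \alpha_i > 0 \,\}
\]
```

Each $x_i$ with $i \in \mathrm{SV}$ is a training instance stored verbatim in the model, so releasing the model releases those individuals' data.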

For the above reasons, this thesis studies: (1) how to construct a data federation infrastructure with privacy guarantees in the big data era; (2) how to protect privacy while learning ML models with a good trade-off between data utility and privacy. Regarding (1), we proposed different frameworks empowered by privacy-aware algorithms that satisfy the definition of differential privacy, the state-of-the-art formal privacy guarantee. Regarding (2), we proposed different neural network architectures that capture the sensitivity of user data, from which the algorithm itself decides how much it should learn from user data to protect privacy while still achieving good performance on a downstream task. The current outcomes of the thesis are: (1) a privacy-guarantee data federation infrastructure for data analysis on sensitive data; (2) privacy-guarantee algorithms for data sharing; (3) privacy-aware data analysis on social network data. The research methods used in this thesis include experiments on real-life social network data to evaluate aspects of the proposed approaches.
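
For reference, the definition of differential privacy mentioned above is usually stated as follows. This is the standard $\varepsilon$-differential privacy formulation; the notation ($\mathcal{M}$, $D$, $D'$, $S$) is conventional and assumed here, not taken from the thesis.

```latex
A randomized mechanism $\mathcal{M}$ satisfies $\varepsilon$-differential privacy if,
for all datasets $D$ and $D'$ differing in one individual's record and for all sets of
outputs $S \subseteq \operatorname{Range}(\mathcal{M})$,
\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S].
\]
```

A smaller privacy budget $\varepsilon$ forces the two output distributions closer together, which strengthens the guarantee at the cost of utility.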

Insights and outcomes from this thesis can be used by both academia and industry to guarantee privacy for data analysis and data sharing involving personal data. They also have the potential to facilitate relevant research in privacy-aware representation learning and related evaluation methods.

Place, publisher, year, edition, pages
Umeå: Department of computing science, Umeå University, 2019. p. 42
Series
Report / UMINF, ISSN 0348-0542 ; 19.06
Keywords
Differential Privacy, Machine Learning, Deep Learning, Big Data
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-162182 (URN); 9789178551101 (ISBN)
Presentation
2019-09-09, 23:40 (English)
Supervisors
Available from: 2019-08-22 Created: 2019-08-15 Last updated: 2019-08-26 Bibliographically approved

Authors

Vu, Xuan-Son; Jiang, Lili; Brändström, Anders; Elmroth, Erik