Cost-efficient feature selection for horizontal federated learning
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-3451-2851
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-2633-6798
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-9842-7840
2024 (English). In: IEEE Transactions on Artificial Intelligence, E-ISSN 2691-4581. Article in journal (Refereed). Epub ahead of print.
Abstract [en]

Horizontal Federated Learning exhibits substantial similarities in feature space across distinct clients. However, not all features contribute significantly to the training of the global model, and the curse of dimensionality slows training. Removing irrelevant and redundant features from the feature space therefore makes training faster and less expensive. This work aims to identify a common feature subset across the clients in federated settings. We introduce a hybrid approach called Fed-MOFS, utilizing Mutual Information and Clustering for local feature selection at each client. Unlike Fed-FiS, which uses a scoring function for global feature ranking, Fed-MOFS employs multi-objective optimization to prioritize features by higher relevance and lower redundancy. This paper compares the performance of Fed-MOFS with conventional and federated feature selection methods. Moreover, we tested the scalability, stability, and efficacy of both Fed-FiS and Fed-MOFS across diverse datasets. We also assessed how feature selection influences model convergence and explored its impact in scenarios with data heterogeneity. Our results show that Fed-MOFS enhances global model performance with a 50% reduction in feature space and is at least twice as fast as the FSHFL method. The computational complexity of both approaches is O(d²), which is lower than the state-of-the-art.
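The abstract names two ingredients — mutual-information-based relevance/redundancy scoring and multi-objective (Pareto) ranking of features — without detailing the algorithm. The following is a generic, self-contained sketch of those two ingredients only, not the paper's actual Fed-MOFS method: the function names, the toy data, and the omission of any clustering or federated aggregation step are all illustrative assumptions.

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information (in nats) between two discrete arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi

def pareto_rank(relevance, redundancy):
    """Non-dominated sorting on (maximize relevance, minimize redundancy).
    Rank 0 is the first Pareto front, rank 1 the next, and so on."""
    ranks = np.zeros(len(relevance), dtype=int)
    remaining = set(range(len(relevance)))
    front = 0
    while remaining:
        # A feature is non-dominated if no other remaining feature is at
        # least as relevant AND at most as redundant, with one strict gap.
        nondom = {i for i in remaining
                  if not any(relevance[j] >= relevance[i]
                             and redundancy[j] <= redundancy[i]
                             and (relevance[j] > relevance[i]
                                  or redundancy[j] < redundancy[i])
                             for j in remaining)}
        for i in nondom:
            ranks[i] = front
        remaining -= nondom
        front += 1
    return ranks

# Toy local dataset: feature 0 is informative, feature 1 duplicates it,
# feature 2 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.stack([y, y, rng.integers(0, 2, 200)], axis=1)

relevance = np.array([mutual_info(X[:, i], y) for i in range(3)])
redundancy = np.array([np.mean([mutual_info(X[:, i], X[:, j])
                                for j in range(3) if j != i])
                       for i in range(3)])
print(relevance, redundancy, pareto_rank(relevance, redundancy))
```

The informative feature scores high on relevance but, because of its duplicate, also high on redundancy; the two objectives pull in opposite directions, which is why a multi-objective ranking rather than a single score is used.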

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024.
Keywords [en]
Feature extraction, Computational modeling, Data models, Training, Federated learning, Artificial intelligence, Servers, Clustering, Horizontal Federated Learning, Feature Selection, Mutual Information, Multi-objective Optimization
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-228215
DOI: 10.1109/TAI.2024.3436664
Scopus ID: 2-s2.0-85200235298
OAI: oai:DiVA.org:umu-228215
DiVA, id: diva2:1886945
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-08-05 Created: 2024-08-05 Last updated: 2024-08-27
In thesis
1. Advancing federated learning: algorithms and use-cases
2024 (English)Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Förbättrad federerad maskininlärning : algoritmer och tillämpningar
Abstract [en]

Federated Learning (FL) is a distributed machine learning paradigm that enables the training of models across numerous clients or organizations without requiring the transfer of local data. This method addresses concerns about data privacy and ownership by keeping raw data on the client itself and only sharing model updates with a central server. Despite its benefits, federated learning faces unique challenges, such as data heterogeneity, computation and communication overheads, and the need for personalized models. These challenges result in reduced model performance, lower efficiency, and longer training times.

This thesis investigates these issues from theoretical, empirical, and practical application perspectives, with four main contributions: federated feature selection, adaptive client selection, model personalization, and socio-cognitive applications. First, we addressed the data heterogeneity problem in federated feature selection for horizontal FL by developing algorithms based on mutual information and multi-objective optimization. Second, we tackled system heterogeneity issues arising from variations in computation, storage, and communication capabilities among clients; we proposed a solution that ranks clients with multi-objective optimization for efficient, fair, and adaptive participation in model training. Third, we addressed client drift caused by data heterogeneity in hierarchical federated learning with a personalized federated learning approach. Lastly, we focused on two key applications that benefit from the FL framework but suffer from data heterogeneity issues. The first application predicts the level of autobiographical memory recall of events associated with lifelog images by developing clustered personalized FL algorithms, which help in selecting effective lifelog image cues for cognitive interventions for the clients. The second application is the development of a personal image privacy advisor for each client. Along with data heterogeneity, the privacy advisor faces data scarcity issues. We developed a daisy-chain-enabled clustered personalized FL algorithm, which predicts whether an image should be shared, kept private, or recommended for sharing by a third party.

Our findings reveal that the proposed methods significantly outperform the current state-of-the-art FL algorithms, delivering superior performance, earlier convergence, and greater training efficiency.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2024. p. 84
Series
Report / UMINF, ISSN 0348-0542 ; 24.09
Keywords
Federated Learning, Federated Feature Selection, Statistical Heterogeneity, System Heterogeneity, Model Personalization, Socio-Cognitive Applications
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-228863 (URN)
978-91-8070-463-2 (ISBN)
978-91-8070-464-9 (ISBN)
Public defence
2024-09-23, Hörsal HUM.D.210, Humanisthuset, Umeå, 13:00 (English)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-09-02 Created: 2024-08-27 Last updated: 2024-08-28
Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text | Scopus

Authority records

Banerjee, Sourasekhar; Elmroth, Erik; Bhuyan, Monowar H.

