Cost-efficient feature selection for horizontal federated learning
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-3451-2851
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-2633-6798
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-9842-7840
2024 (English) In: IEEE Transactions on Artificial Intelligence, E-ISSN 2691-4581, Vol. 5, no. 12, pp. 6551-6565. Journal article (Refereed) Published
Abstract [en]

Horizontal Federated Learning exhibits substantial similarity in feature space across distinct clients. However, not all features contribute significantly to training the global model, and the curse of dimensionality slows training. Removing irrelevant and redundant features from the feature space therefore makes training faster and cheaper. This work aims to identify a common feature subset across the clients in federated settings. We introduce a hybrid approach called Fed-MOFS, which uses mutual information and clustering for local feature selection at each client. Unlike Fed-FiS, which uses a scoring function for global feature ranking, Fed-MOFS employs multi-objective optimization to prioritize features by higher relevance and lower redundancy. This paper compares the performance of Fed-MOFS with conventional and federated feature selection methods. Moreover, we tested the scalability, stability, and efficacy of both Fed-FiS and Fed-MOFS across diverse datasets. We also assessed how feature selection influences model convergence and explored its impact in scenarios with data heterogeneity. Our results show that Fed-MOFS enhances global model performance with a 50% reduction in feature space and is at least twice as fast as the FSHFL method. The computational complexity of both approaches is O(d²), which is lower than the state of the art.
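The local selection step described in the abstract (score features by mutual information with the label, then group the scores by clustering and keep the strong group) can be illustrated with a minimal sketch. This is not the paper's implementation: the discrete MI estimator, the 1-D two-means grouping of scores, and the keep-the-stronger-cluster rule below are illustrative assumptions.

```python
import numpy as np

def mutual_information(x, y):
    """I(x; y) in nats for discrete-valued 1-D arrays."""
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * np.mean(y == yv)))
    return mi

def local_feature_subset(X, y, n_iter=50):
    """Score each column of X by MI with y, split the scores into a 'strong'
    and a 'weak' cluster with 1-D 2-means, and keep the strong cluster."""
    scores = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    # Lloyd's algorithm on the 1-D score values, two centroids
    c = np.array([scores.min(), scores.max()], dtype=float)
    for _ in range(n_iter):
        assign = np.abs(scores[:, None] - c[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                c[k] = scores[assign == k].mean()
    strong = assign == c.argmax()           # cluster with the higher centroid
    return np.flatnonzero(strong), scores
```

For example, a feature identical to the label gets MI log 2 on balanced binary data, while a constant feature gets 0, so only the former survives the split.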

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. Vol. 5, no. 12, pp. 6551-6565
Keywords [en]
Feature extraction, Computational modeling, Data models, Training, Federated learning, Artificial intelligence, Servers, Clustering, Horizontal Federated Learning, Feature Selection, Mutual Information, Multi-objective Optimization
National subject category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-228215
DOI: 10.1109/TAI.2024.3436664
Scopus ID: 2-s2.0-85200235298
OAI: oai:DiVA.org:umu-228215
DiVA, id: diva2:1886945
Research funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-08-05 Created: 2024-08-05 Last updated: 2025-01-13 Bibliographically approved
Part of thesis
1. Advancing federated learning: algorithms and use-cases
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Förbättrad federerad maskininlärning: algoritmer och tillämpningar
Abstract [en]

Federated Learning (FL) is a distributed machine learning paradigm that enables the training of models across numerous clients or organizations without requiring the transfer of local data. This method addresses concerns about data privacy and ownership by keeping raw data on the client itself and sharing only model updates with a central server. Despite its benefits, federated learning faces unique challenges, such as data heterogeneity, computation and communication overheads, and the need for personalized models, which result in reduced model performance, lower efficiency, and longer training times.

This thesis investigates these issues from theoretical, empirical, and practical application perspectives, with four contributions: federated feature selection, adaptive client selection, model personalization, and socio-cognitive applications. First, we addressed data heterogeneity in federated feature selection for horizontal FL by developing algorithms based on mutual information and multi-objective optimization. Second, we tackled system heterogeneity, that is, variations in computation, storage, and communication capabilities among clients, by proposing a solution that ranks clients with multi-objective optimization for efficient, fair, and adaptive participation in model training. Third, we addressed client drift caused by data heterogeneity in hierarchical federated learning with a personalized federated learning approach. Lastly, we focused on two key applications that benefit from the FL framework but suffer from data heterogeneity. The first predicts the level of autobiographical memory recall for events associated with lifelog images by developing clustered personalized FL algorithms, which help select effective lifelog image cues for cognitive interventions for the clients. The second is a personal image-privacy advisor for each client; alongside data heterogeneity, the privacy advisor also faces data scarcity. We developed a daisy-chain-enabled clustered personalized FL algorithm that predicts whether an image should be shared, kept private, or recommended for sharing by a third party.

Our findings reveal that the proposed methods significantly outperform current state-of-the-art FL algorithms, delivering superior performance, earlier convergence, and greater training efficiency.
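The multi-objective prioritization used in several of these contributions (ranking features or clients along competing objectives) can be sketched generically with non-dominated sorting. The two objectives and the plain Pareto-front ranking below are illustrative assumptions, not the thesis's actual algorithms.

```python
import numpy as np

def pareto_front(points):
    """Indices of non-dominated rows; every objective is to be maximized."""
    front = []
    for i in range(len(points)):
        dominated = any(
            j != i
            and np.all(points[j] >= points[i])
            and np.any(points[j] > points[i])
            for j in range(len(points))
        )
        if not dominated:
            front.append(i)
    return front

def rank_by_fronts(points):
    """Rank items by successive Pareto fronts (rank 0 = best front).

    For feature selection the objectives could be (relevance, -redundancy);
    for client selection, e.g. (data size, compute capacity) -- both
    hypothetical choices here.
    """
    points = np.asarray(points, dtype=float)
    remaining = list(range(len(points)))
    ranks = np.empty(len(points), dtype=int)
    r = 0
    while remaining:
        local = pareto_front(points[remaining])   # fronts within what's left
        front = [remaining[i] for i in local]
        for i in front:
            ranks[i] = r
        remaining = [i for i in remaining if i not in front]
        r += 1
    return ranks
```

With items scored (1,1), (2,2), and (0,3), the second and third are mutually non-dominated and share rank 0, while (1,1) is dominated by (2,2) and falls to rank 1.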

Place, publisher, year, edition, pages
Umeå: Umeå University, 2024. p. 84
Series
Report / UMINF, ISSN 0348-0542 ; 24.09
Keywords
Federated Learning, Federated Feature Selection, Statistical Heterogeneity, System Heterogeneity, Model Personalization, Socio-Cognitive Applications
National subject category
Computer Sciences
Research subject
Computing Science
Identifiers
urn:nbn:se:umu:diva-228863 (URN)
978-91-8070-463-2 (ISBN)
978-91-8070-464-9 (ISBN)
Public defence
2024-09-23, Hörsal HUM.D.210, Humanisthuset, Umeå, 13:00 (English)
Research funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-09-02 Created: 2024-08-27 Last updated: 2024-08-28 Bibliographically approved

Open Access in DiVA
Full text is not available in DiVA
Other links
Publisher's full text | Scopus

Person
Banerjee, Sourasekhar; Elmroth, Erik; Bhuyan, Monowar H.
