Prakhya, Karthik
Publications (3 of 3)
Prakhya, K., Birdal, T. & Yurtsever, A. (2025). Convex formulations for training two-layer ReLU neural networks. In: 13th International Conference on Learning Representations, ICLR 2025. Paper presented at International Conference on Learning Representations (ICLR), Singapore, April 24-28, 2025 (pp. 30682-30704). Curran Associates, Inc.
2025 (English). In: 13th International Conference on Learning Representations, ICLR 2025, Curran Associates, Inc., 2025, p. 30682-30704. Conference paper, Published paper (Refereed)
Abstract [en]

Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex formulations have been used for verifying neural network robustness, their application to training neural networks remains less explored. In response to this challenge, we reformulate the problem of training infinite-width two-layer ReLU networks as a convex completely positive program in a finite-dimensional (lifted) space. Despite the convexity, solving this problem remains NP-hard due to the complete positivity constraint. To overcome this challenge, we introduce a semidefinite relaxation that can be solved in polynomial time. We then experimentally evaluate the tightness of this relaxation, demonstrating its competitive performance in test accuracy across a range of classification tasks.

Place, publisher, year, edition, pages
Curran Associates, Inc., 2025
Keywords
copositive programming, semidefinite programming, neural networks
National Category
Artificial Intelligence
Identifiers
urn:nbn:se:umu:diva-236599 (URN); 2-s2.0-105010230817 (Scopus ID); 979-8-3313-2085-0 (ISBN)
Conference
International Conference on Learning Representations (ICLR), Singapore, April 24-28, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Knut and Alice Wallenberg Foundation; Swedish Research Council
Available from: 2025-03-17. Created: 2025-03-17. Last updated: 2025-07-18. Bibliographically approved.
Dadras, A., Banerjee, S., Prakhya, K. & Yurtsever, A. (2024). Federated Frank-Wolfe algorithm. In: Albert Bifet; Jesse Davis; Tomas Krilavičius; Meelis Kull; Eirini Ntoutsi; Indrė Žliobaitė (Ed.), Machine learning and knowledge discovery in databases. Research track: European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, proceedings, part III. Paper presented at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2024), Vilnius, Lithuania, September 9-13, 2024 (pp. 58-75). Springer Nature
2024 (English). In: Machine learning and knowledge discovery in databases. Research track: European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, proceedings, part III / [ed] Albert Bifet; Jesse Davis; Tomas Krilavičius; Meelis Kull; Eirini Ntoutsi; Indrė Žliobaitė, Springer Nature, 2024, p. 58-75. Conference paper, Published paper (Refereed)
Abstract [en]

Federated learning (FL) has gained a lot of attention in recent years for building privacy-preserving collaborative learning systems. However, FL algorithms for constrained machine learning problems are still limited, particularly when the projection step is costly. To this end, we propose a Federated Frank-Wolfe Algorithm (FedFW). FedFW features data privacy, low per-iteration cost, and communication of sparse signals. In the deterministic setting, FedFW achieves an ε-suboptimal solution within O(ε^{-2}) iterations for smooth and convex objectives, and O(ε^{-3}) iterations for smooth but non-convex objectives. Furthermore, we present a stochastic variant of FedFW and show that it finds a solution within O(ε^{-3}) iterations in the convex setting. We demonstrate the empirical performance of FedFW on several machine learning tasks.
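
For readers unfamiliar with the term "projection-free", the following is a minimal sketch of the classical, single-machine Frank-Wolfe method over the probability simplex. It is not the FedFW algorithm from the paper: the function name `frank_wolfe_simplex`, the toy quadratic objective, the target point `c`, and the step size 2/(t+2) are all assumptions chosen for illustration. It only shows how a linear minimization step can stand in for a projection.

```python
def frank_wolfe_simplex(grad, dim, num_iters=200):
    """Classical Frank-Wolfe over the probability simplex (illustrative only)."""
    x = [1.0 / dim] * dim  # feasible start: the uniform distribution
    for t in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex e_i whose gradient coordinate is smallest.
        i = min(range(dim), key=lambda j: g[j])
        gamma = 2.0 / (t + 2.0)  # standard open-loop step size
        # x <- (1 - gamma) * x + gamma * e_i. A convex combination of feasible
        # points stays feasible, so no projection is ever computed.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Hypothetical toy objective: f(x) = 0.5 * ||x - c||^2, with c in the simplex.
c = [0.2, 0.5, 0.3]
grad = lambda x: [xj - cj for xj, cj in zip(x, c)]
x_hat = frank_wolfe_simplex(grad, dim=3, num_iters=2000)
print(x_hat)  # close to c, and always a valid probability vector
```

In this simplex example each oracle call returns a single vertex, so the update direction is described by one index. That is one simple way to picture why Frank-Wolfe-style steps can be cheap per iteration and sparse to communicate.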

Place, publisher, year, edition, pages
Springer Nature, 2024
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14943
Keywords
federated learning, frank wolfe, conditional gradient method, projection-free, distributed optimization
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-228614 (URN); 10.1007/978-3-031-70352-2_4 (DOI); 001308375900004 (); 978-3-031-70351-5 (ISBN); 978-3-031-70352-2 (ISBN)
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2024), Vilnius, Lithuania, September 9-13, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2023-05476
Note

Also part of the book sub series: Lecture Notes in Artificial Intelligence (LNAI). 

Available from: 2024-08-19. Created: 2024-08-19. Last updated: 2025-04-24. Bibliographically approved.
Dadras, A., Prakhya, K. & Yurtsever, A. (2022). Federated Frank-Wolfe Algorithm. Paper presented at FL-NeurIPS'22, International Workshop on Federated Learning: Recent Advances and New Challenges in Conjunction with NeurIPS 2022, New Orleans, LA, USA, December 2, 2022.
2022 (English). Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

Federated learning (FL) has gained much attention in recent years for building privacy-preserving collaborative learning systems. However, FL algorithms for constrained machine learning problems are still very limited, particularly when the projection step is costly. To this end, we propose a Federated Frank-Wolfe Algorithm (FedFW). FedFW provably finds an ε-suboptimal solution of the constrained empirical risk-minimization problem after O(ε^{-2}) iterations if the objective function is convex. The rate becomes O(ε^{-3}) if the objective is non-convex. The method enjoys data privacy, low per-iteration cost and communication of sparse signals. We demonstrate empirical performance of the FedFW algorithm on several machine learning tasks.

Keywords
federated learning, frank wolfe, conditional gradient method, projection-free, distributed optimization
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-205126 (URN)
Conference
FL-NeurIPS'22, International Workshop on Federated Learning: Recent Advances and New Challenges in Conjunction with NeurIPS 2022, New Orleans, LA, USA, December 2, 2022
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-02-23. Created: 2023-02-23. Last updated: 2025-01-23. Bibliographically approved.