Publications (10 of 22)
Prakhya, K., Birdal, T. & Yurtsever, A. (2025). Convex formulations for training two-layer ReLU neural networks. In: 13th International Conference on Learning Representations, ICLR 2025. Paper presented at International Conference on Learning Representations (ICLR), Singapore, April 24-28, 2025 (pp. 30682-30704). Curran Associates, Inc.
Convex formulations for training two-layer ReLU neural networks
2025 (English). In: 13th International Conference on Learning Representations, ICLR 2025, Curran Associates, Inc., 2025, p. 30682-30704. Conference paper, Published paper (Refereed)
Abstract [en]

Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex formulations have been used for verifying neural network robustness, their application to training neural networks remains less explored. In response to this challenge, we reformulate the problem of training infinite-width two-layer ReLU networks as a convex completely positive program in a finite-dimensional (lifted) space. Despite the convexity, solving this problem remains NP-hard due to the complete positivity constraint. To overcome this challenge, we introduce a semidefinite relaxation that can be solved in polynomial time. We then experimentally evaluate the tightness of this relaxation, demonstrating its competitive performance in test accuracy across a range of classification tasks.
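
The pivotal step above is the passage from an intractable rank-constrained (completely positive) lifted program to a polynomial-time semidefinite relaxation. The paper's lifted ReLU training program is not reproduced here; the toy sketch below (assuming cvxpy is installed) shows the generic relaxation pattern on a classic quadratic problem: lift $x$ to $X = xx^\top$, then drop the rank-one constraint while keeping a PSD constraint.

```python
import cvxpy as cp
import numpy as np

# Toy SDP relaxation (Goemans-Williamson style), illustrating the lifting
# idea only -- NOT the paper's completely positive ReLU training program.
n = 5
rng = np.random.default_rng(0)
Q = rng.standard_normal((n, n))
Q = (Q + Q.T) / 2                       # symmetric cost matrix

# Lift: x x^T -> X, drop rank(X) = 1, keep the convex constraints.
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0,                  # PSD in place of complete positivity
               cp.diag(X) == 1]         # lifted version of x_i in {-1, +1}
prob = cp.Problem(cp.Minimize(cp.trace(Q @ X)), constraints)
prob.solve()
print("relaxation value:", prob.value)  # lower bound on the nonconvex optimum
```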

Place, publisher, year, edition, pages
Curran Associates, Inc., 2025
Keywords
copositive programming, semidefinite programming, neural networks
National Category
Artificial Intelligence
Identifiers
urn:nbn:se:umu:diva-236599 (URN); 2-s2.0-105010230817 (Scopus ID); 979-8-3313-2085-0 (ISBN)
Conference
International Conference on Learning Representations (ICLR), Singapore, April 24-28, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Knut and Alice Wallenberg Foundation; Swedish Research Council
Available from: 2025-03-17. Created: 2025-03-17. Last updated: 2025-07-18. Bibliographically approved.
Nguyen, A. D., Yurtsever, A., Sra, S. & Toh, K. C. (2025). Improved rates for stochastic variance-reduced difference-of-convex algorithms. Paper presented at CDC 2025, 64th IEEE Conference on Decision and Control, Rio de Janeiro, Brazil, December 9-12, 2025.
Improved rates for stochastic variance-reduced difference-of-convex algorithms
2025 (English). Conference paper, Oral presentation only (Refereed)
Abstract [en]

In this work, we propose and analyze DCA-PAGE, a novel algorithm that integrates the difference-of-convex algorithm (DCA) with the ProbAbilistic Gradient Estimator (PAGE) to solve structured nonsmooth difference-of-convex programs. In the finite-sum setting, our method achieves a gradient computation complexity of $O(N + N^{1/2}\epsilon^{-2})$ with sample size $N$, surpassing the previous best-known complexity of $O(N + N^{2/3}\epsilon^{-2})$ for stochastic variance-reduced (SVR) DCA methods. Furthermore, DCA-PAGE readily extends to online settings with a similar optimal gradient computation complexity of $O(b + b^{1/2}\epsilon^{-2})$ with batch size $b$, a significant advantage over existing SVR DCA approaches that only work for the finite-sum setting. We further refine our analysis with a gap function, which enables us to obtain comparable convergence guarantees under milder assumptions.
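
As a rough illustration of the algorithmic pattern (a sketch, not the authors' pseudocode), the code below combines the DCA outer step, which linearizes the concave part $-h$ of $f = g - h$, with a PAGE-style variance-reduced estimator of $\nabla h$; `grad_h_i` and `solve_subproblem` are hypothetical placeholders.

```python
import numpy as np

def dca_page(grad_h_i, solve_subproblem, x0, N, batch, p, iters, rng):
    """Sketch of a DCA outer loop driven by a PAGE-style variance-reduced
    estimator of grad h, for f = g - h with g, h convex. Hypothetical API:
    grad_h_i(i, x) is the gradient of the i-th component of h;
    solve_subproblem(v) returns argmin_x g(x) - <v, x>."""
    x_old = x0.copy()
    v = np.mean([grad_h_i(i, x_old) for i in range(N)], axis=0)  # full pass
    x = solve_subproblem(v)
    for _ in range(iters):
        idx = rng.choice(N, size=batch, replace=False)
        if rng.random() < p:   # occasionally refresh the estimate
            v = np.mean([grad_h_i(i, x) for i in idx], axis=0)
        else:                  # usually apply the cheap PAGE correction
            v = v + np.mean([grad_h_i(i, x) - grad_h_i(i, x_old)
                             for i in idx], axis=0)
        x_old, x = x, solve_subproblem(v)  # DCA step on the linearized model
    return x

# Tiny smoke test: f(x) = ||x||^2/2 - (1/N) sum_i log cosh(a_i . x)
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))
x = dca_page(lambda i, x: np.tanh(A[i] @ x) * A[i],  # grad of log cosh(a_i.x)
             lambda v: v,                            # g = ||.||^2/2 => x = v
             np.zeros(10), N=100, batch=10, p=0.2, iters=200, rng=rng)
```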

National Category
Mathematical sciences
Identifiers
urn:nbn:se:umu:diva-244644 (URN)
Conference
CDC 2025, 64th IEEE Conference on Decision and Control, Rio de Janeiro, Brazil, December 9-12, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-09-26. Created: 2025-09-26. Last updated: 2025-09-30.
Dadras, A., Stich, S. U. & Yurtsever, A. (2025). Personalized federated learning via low-rank matrix optimization. Transactions on Machine Learning Research, 2025-August
Personalized federated learning via low-rank matrix optimization
2025 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2025-August. Article in journal (Refereed), Published
Abstract [en]

Personalized Federated Learning (pFL) has gained significant attention for building a suite of models tailored to different clients. In pFL, the challenge lies in balancing the reliance on local datasets, which may lack representativeness, against the diversity of other clients’ models, whose quality and relevance are uncertain. Focusing on the clustered FL scenario, where devices are grouped based on similarities in their data distributions without prior knowledge of cluster memberships, we develop a mathematical model for pFL using low-rank matrix optimization. Building on this formulation, we propose a pFL approach leveraging the Burer-Monteiro factorization technique. We examine the convergence guarantees of the proposed method and present numerical experiments on training deep neural networks, demonstrating the empirical performance of the proposed method in scenarios where personalization is crucial.
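
The Burer-Monteiro device named above is compact enough to sketch: rather than optimizing over an $n \times n$ PSD matrix $X$, optimize over an $n \times r$ factor $U$ with $X = UU^\top$, which enforces positive semidefiniteness and the rank bound for free. A minimal centralized sketch (the paper's federated, clustered formulation adds client structure on top):

```python
import numpy as np

def burer_monteiro(grad_f, n, r, steps=500, lr=0.05, seed=0):
    """Minimize f(X) over PSD matrices of rank <= r via X = U U^T.
    grad_f(X) must return the (symmetric) gradient of f at X."""
    U = np.random.default_rng(seed).standard_normal((n, r)) * 0.1
    for _ in range(steps):
        G = grad_f(U @ U.T)
        U -= lr * 2.0 * (G @ U)  # chain rule: d/dU f(U U^T) = (G + G^T) U
    return U @ U.T

# Example: low-rank approximation of a PSD matrix M, f(X) = ||X - M||_F^2 / 2
rng = np.random.default_rng(1)
B = rng.standard_normal((8, 2))
M = B @ B.T
X = burer_monteiro(lambda X: X - M, n=8, r=2)
```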

National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:umu:diva-243776 (URN); 2-s2.0-105014128535 (Scopus ID)
Funder
Swedish Research Council, 2023-05476; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-09-09. Created: 2025-09-09. Last updated: 2025-09-09. Bibliographically approved.
Banerjee, S., Dadras, A., Yurtsever, A. & Bhuyan, M. H. (2025). Personalized multi-tier federated learning. In: Mufti Mahmud; Maryam Doborjeh; Kevin Wong; Andrew Chi Sing Leung; Zohreh Doborjeh; M. Tanveer (Ed.), Neural information processing: 31st International Conference, ICONIP 2024, Auckland, New Zealand, December 2–6, 2024, proceedings, part II. Paper presented at ICONIP 2024, 31st International Conference on Neural Information Processing, Auckland, New Zealand, December 2-6, 2024 (pp. 192-207). Springer Nature
Personalized multi-tier federated learning
2025 (English). In: Neural information processing: 31st International Conference, ICONIP 2024, Auckland, New Zealand, December 2–6, 2024, proceedings, part II / [ed] Mufti Mahmud; Maryam Doborjeh; Kevin Wong; Andrew Chi Sing Leung; Zohreh Doborjeh; M. Tanveer, Springer Nature, 2025, p. 192-207. Conference paper, Published paper (Refereed)
Abstract [en]

The key challenge of personalized federated learning (PerFL) is to capture the statistical heterogeneity of data with inexpensive communication and to deliver customized performance to participating devices. To address this, we introduce personalized federated learning in a multi-tier architecture (PerMFL), which yields optimized and personalized local models when there are known team structures across devices. We provide theoretical guarantees for PerMFL: linear convergence rates for smooth strongly convex problems and sub-linear convergence rates for smooth non-convex problems. Our numerical experiments demonstrate the robust empirical performance of PerMFL, which outperforms the state of the art in multiple personalized federated learning tasks.

Place, publisher, year, edition, pages
Springer Nature, 2025
Series
Communications in Computer and Information Science, ISSN 1865-0929, E-ISSN 1865-0937 ; 2283
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-228859 (URN); 10.1007/978-981-96-6951-6_14 (DOI); 2-s2.0-105010821830 (Scopus ID); 978-981-96-6950-9 (ISBN); 978-981-96-6951-6 (ISBN)
Conference
ICONIP 2024, 31st International Conference on Neural Information Processing, Auckland, New Zealand, December 2-6, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-08-27. Created: 2024-08-27. Last updated: 2025-07-28. Bibliographically approved.
Palenzuela, K., Dadras, A., Yurtsever, A. & Löfstedt, T. (2025). Provable reduction in communication rounds for non-smooth convex federated learning. In: 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP). Paper presented at 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP), Istanbul, Turkiye, August 31 - September 3, 2025. IEEE
Provable reduction in communication rounds for non-smooth convex federated learning
2025 (English). In: 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP), IEEE, 2025. Conference paper, Published paper (Refereed)
Abstract [en]

Multiple local steps are key to communication-efficient federated learning. However, for general non-smooth convex problems, theoretical guarantees for such algorithms without assumptions that bound data heterogeneity have been lacking. Leveraging projection-efficient optimization methods, we propose FedMLS, a federated learning algorithm with provable improvements from multiple local steps. FedMLS attains an $\epsilon$-suboptimal solution in $O(1/\epsilon)$ communication rounds, requiring a total of $O(1/\epsilon^2)$ stochastic subgradient oracle calls.
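
For orientation, a generic template of the multiple-local-steps mechanism the abstract refers to (a sketch only; FedMLS's actual update rule, projections, and step-size schedule are specified in the paper, and `clients` is a hypothetical list of per-client subgradient oracles):

```python
import numpy as np

def fed_local_subgradient(clients, x0, rounds, local_steps, lr):
    """Generic local-steps template (not FedMLS's exact scheme): each client
    runs several subgradient steps, then the server averages the iterates."""
    x = x0.copy()
    for _ in range(rounds):
        updates = []
        for subgrad in clients:          # subgrad(y) returns a subgradient
            y = x.copy()
            for _ in range(local_steps): # multiple cheap local steps
                y -= lr * subgrad(y)
            updates.append(y)
        x = np.mean(updates, axis=0)     # one communication round
    return x
```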

Place, publisher, year, edition, pages
IEEE, 2025
Series
Machine learning for signal processing, ISSN 1551-2541, E-ISSN 2161-0371
National Category
Computer Sciences; Artificial Intelligence; Algorithms
Identifiers
urn:nbn:se:umu:diva-246282 (URN); 10.1109/MLSP62443.2025.11204301 (DOI); 979-8-3315-7030-9 (ISBN); 979-8-3315-7029-3 (ISBN)
Conference
2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP), Istanbul, Turkiye, August 31 - September 3, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-11-10. Created: 2025-11-10. Last updated: 2025-11-21. Bibliographically approved.
Maskan, H., Halvachi, P., Sra, S. & Yurtsever, A. (2025). Randomized block coordinate DC algorithm. EURO Journal on Computational Optimization, 13, Article ID 100123.
Randomized block coordinate DC algorithm
2025 (English). In: EURO Journal on Computational Optimization, ISSN 2192-4406, Vol. 13, article id 100123. Article in journal (Refereed), Published
Abstract [en]

We introduce an extension of the Difference of Convex Algorithm (DCA) in the form of a randomized block coordinate approach for problems with separable structure. For $n$ coordinate blocks and $k$ iterations, our main result proves a non-asymptotic convergence rate of $O(n/k)$ in expectation, with respect to a stationarity measure based on a Forward-Backward envelope. Furthermore, leveraging the connection between DCA and Expectation Maximization (EM), we propose a randomized block coordinate EM algorithm.
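
The block-coordinate DCA step described above is straightforward to sketch for the separable case $f(x) = \sum_i g_i(x_i) - h(x)$: sample one block, linearize $h$ there, and solve the resulting convex block subproblem. A schematic with hypothetical helpers `grad_h` and `solve_block`:

```python
import numpy as np

def rbc_dca(grad_h, solve_block, x0, blocks, iters, rng):
    """Randomized block-coordinate DCA sketch for f(x) = sum_i g_i(x_i) - h(x)
    with separable convex part g. Hypothetical API: grad_h(x) is the full
    gradient of h; solve_block(i, v) returns argmin_{x_i} g_i(x_i) - <v, x_i>."""
    x = x0.copy()
    for _ in range(iters):
        i = rng.integers(len(blocks))  # sample one coordinate block
        idx = blocks[i]
        v = grad_h(x)[idx]             # linearize h at the current point
        x[idx] = solve_block(i, v)     # convex block subproblem
    return x
```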

Place, publisher, year, edition, pages
Elsevier, 2025
Keywords
Block coordinate methods, Difference of convex, Nonconvex optimization
National Category
Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-247450 (URN); 10.1016/j.ejco.2025.100123 (DOI); 2-s2.0-105023553755 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2025-12-12. Created: 2025-12-12. Last updated: 2025-12-12. Bibliographically approved.
Dadras, A., Banerjee, S., Prakhya, K. & Yurtsever, A. (2024). Federated Frank-Wolfe algorithm. In: Albert Bifet; Jesse Davis; Tomas Krilavičius; Meelis Kull; Eirini Ntoutsi; Indrė Žliobaitė (Ed.), Machine learning and knowledge discovery in databases. Research track: European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, proceedings, part III. Paper presented at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2024), Vilnius, Lithuania, September 9-13, 2024 (pp. 58-75). Springer Nature
Federated Frank-Wolfe algorithm
2024 (English). In: Machine learning and knowledge discovery in databases. Research track: European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, proceedings, part III / [ed] Albert Bifet; Jesse Davis; Tomas Krilavičius; Meelis Kull; Eirini Ntoutsi; Indrė Žliobaitė, Springer Nature, 2024, p. 58-75. Conference paper, Published paper (Refereed)
Abstract [en]

Federated learning (FL) has gained a lot of attention in recent years for building privacy-preserving collaborative learning systems. However, FL algorithms for constrained machine learning problems are still limited, particularly when the projection step is costly. To this end, we propose a Federated Frank-Wolfe Algorithm (FedFW). FedFW features data privacy, low per-iteration cost, and communication of sparse signals. In the deterministic setting, FedFW achieves an $\epsilon$-suboptimal solution within $O(\epsilon^{-2})$ iterations for smooth and convex objectives, and $O(\epsilon^{-3})$ iterations for smooth but non-convex objectives. Furthermore, we present a stochastic variant of FedFW and show that it finds a solution within $O(\epsilon^{-3})$ iterations in the convex setting. We demonstrate the empirical performance of FedFW on several machine learning tasks.
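
A schematic of the projection-free mechanism underlying FedFW (not its exact penalty-based per-client scheme): over an $\ell_1$ ball, the Frank-Wolfe linear minimization oracle returns 1-sparse vertices, which is where the sparse communicated signals mentioned above come from.

```python
import numpy as np

def lmo_l1(grad, radius=1.0):
    """Linear minimization oracle over the l1 ball: a 1-sparse vertex."""
    s = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    s[i] = -radius * np.sign(grad[i])
    return s

def fed_frank_wolfe(client_grads, x0, rounds):
    """Schematic distributed Frank-Wolfe (not FedFW's exact scheme):
    aggregate client gradients, then take a projection-free FW step."""
    x = x0.copy()
    for t in range(rounds):
        g = np.mean([grad(x) for grad in client_grads], axis=0)
        s = lmo_l1(g)             # 1-sparse direction -> cheap to communicate
        gamma = 2.0 / (t + 2.0)   # standard FW step size
        x = (1 - gamma) * x + gamma * s
    return x
```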

Place, publisher, year, edition, pages
Springer Nature, 2024
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14943
Keywords
federated learning, frank wolfe, conditional gradient method, projection-free, distributed optimization
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-228614 (URN); 10.1007/978-3-031-70352-2_4 (DOI); 001308375900004 (ISI); 978-3-031-70351-5 (ISBN); 978-3-031-70352-2 (ISBN)
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2024), Vilnius, Lithuania, September 9-13, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2023-05476
Note

Also part of the book subseries: Lecture Notes in Artificial Intelligence (LNAI).

Available from: 2024-08-19. Created: 2024-08-19. Last updated: 2025-04-24. Bibliographically approved.
Dadras, A., Stich, S. U. & Yurtsever, A. (2024). Personalized federated learning via low-rank matrix factorization. In: OPT 2024: optimization for machine learning. Paper presented at NeurIPS 2024 Workshop, OPT2024: 16th Annual Workshop on Optimization for Machine Learning, Vancouver, Canada, December 15, 2024, Article ID 130.
Personalized federated learning via low-rank matrix factorization
2024 (English). In: OPT 2024: optimization for machine learning, 2024, article id 130. Conference paper, Published paper (Refereed)
Abstract [en]

Personalized Federated Learning (pFL) has gained attention for building a suite of models tailored to different clients. In pFL, the challenge lies in balancing the reliance on local datasets, which may lack representativeness, against the diversity of other clients’ models, whose quality and relevance are uncertain. Focusing on the clustered FL scenario, where devices are grouped based on similarities in their data distributions without prior knowledge of cluster memberships, we develop a mathematical model for pFL using low-rank matrix optimization. Building on this formulation, we propose a pFL approach leveraging the Burer-Monteiro factorization technique. We examine the convergence guarantees of the proposed method and present numerical experiments on training deep neural networks, demonstrating the empirical performance of the proposed method in scenarios where personalization is crucial.

Keywords
Personalized Federated Learning, Machine Learning, Optimization
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-234480 (URN)
Conference
NeurIPS 2024 Workshop, OPT2024: 16th Annual Workshop on Optimization for Machine Learning, Vancouver, Canada, December 15, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2023-05476
Available from: 2025-01-23. Created: 2025-01-23. Last updated: 2025-01-23. Bibliographically approved.
Maskan, H., Zygalakis, K. C. & Yurtsever, A. (2023). A Variational Perspective on High-Resolution ODEs. In: Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Paper presented at 37th Conference on Neural Information Processing Systems, NeurIPS 2023, New Orleans, USA, December 10-16, 2023. Neural information processing systems foundation
A Variational Perspective on High-Resolution ODEs
2023 (English). In: Advances in Neural Information Processing Systems 36 (NeurIPS 2023), Neural information processing systems foundation, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

We consider unconstrained minimization of smooth convex functions. We propose a novel variational perspective using the forced Euler-Lagrange equation that allows for studying high-resolution ODEs. Through this, we obtain a faster convergence rate for gradient norm minimization using Nesterov's accelerated gradient method. Additionally, we show that Nesterov's method can be interpreted as a rate-matching discretization of an appropriately chosen high-resolution ODE. Finally, using the results from the new variational perspective, we propose a stochastic method for noisy gradients. Several numerical experiments compare our stochastic algorithm with state-of-the-art methods.
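
For reference, the discrete method whose continuous-time (high-resolution ODE) limit the paper analyzes; one standard parameterization of Nesterov's accelerated gradient for an $L$-smooth convex $f$ (not the paper's notation):

```python
import numpy as np

def nesterov(grad_f, x0, L, iters):
    """Nesterov's accelerated gradient, standard form: a gradient step from
    the extrapolated point y, followed by momentum extrapolation."""
    x, y = x0.copy(), x0.copy()
    for k in range(iters):
        x_next = y - grad_f(y) / L                # gradient step at y
        y = x_next + k / (k + 3) * (x_next - x)   # momentum extrapolation
        x = x_next
    return x
```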

Place, publisher, year, edition, pages
Neural information processing systems foundation, 2023
Series
Advances in neural information processing systems, ISSN 1049-5258
National Category
Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-223951 (URN); 2-s2.0-85191188553 (Scopus ID)
Conference
37th Conference on Neural Information Processing Systems, NeurIPS 2023, New Orleans, USA, December 10-16, 2023
Available from: 2024-05-03. Created: 2024-05-03. Last updated: 2024-07-02. Bibliographically approved.
Yurtsever, A. & Sra, S. (2022). CCCP is Frank-Wolfe in Disguise. In: S. Koyejo; S. Mohamed; A. Agarwal; D. Belgrave; K. Cho; A. Oh (Ed.), Advances in Neural Information Processing Systems 35 (NeurIPS 2022). Paper presented at NeurIPS 2022, Thirty-sixth Conference on Neural Information Processing Systems, Hybrid via New Orleans, USA, November 28 - December 9, 2022.
CCCP is Frank-Wolfe in Disguise
2022 (English). In: Advances in Neural Information Processing Systems 35 (NeurIPS 2022) / [ed] S. Koyejo; S. Mohamed; A. Agarwal; D. Belgrave; K. Cho; A. Oh, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

This paper uncovers a simple but rather surprising connection: it shows that the well-known convex-concave procedure (CCCP) and its generalization to constrained problems are both special cases of the Frank-Wolfe (FW) method. This connection not only provides insight of deep (in our opinion) pedagogical value, but also transfers the recently discovered convergence theory of nonconvex Frank-Wolfe methods immediately to CCCP, closing a long-standing gap in its non-asymptotic convergence theory. We hope the viewpoint uncovered by this paper spurs the transfer of other advances made for FW to both CCCP and its generalizations.
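
The reduction can be stated in a few lines; the following paraphrases the epigraph construction (modulo compactness and regularity conditions handled in the paper):

```latex
% CCCP for  min_x  g(x) - h(x),  with g, h convex:
x_{k+1} \in \operatorname*{arg\,min}_x \; g(x) - \langle \nabla h(x_k),\, x \rangle .
% Frank-Wolfe view: minimize t - h(x) over the epigraph of g,
\min_{(x,t)\,\in\, \operatorname{epi} g} \; t - h(x),
\qquad \operatorname{epi} g = \{(x,t) : g(x) \le t\}.
% The linear minimization oracle at (x_k, t_k) solves
\min_{(s,\tau)\,:\, g(s) \le \tau} \; \tau - \langle \nabla h(x_k),\, s \rangle
\;=\; \min_s \; g(s) - \langle \nabla h(x_k),\, s \rangle ,
% which is exactly the CCCP subproblem, so Frank-Wolfe with step size
% \gamma_k = 1 reproduces the CCCP iterates.
```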

Series
Advances in neural information processing systems, ISSN 1049-5258
National Category
Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-200701 (URN); 2-s2.0-85162842285 (Scopus ID); 9781713871088 (ISBN)
Conference
NeurIPS 2022, Thirty-sixth Conference on Neural Information Processing Systems, Hybrid via New Orleans, USA, November 28 - December 9, 2022
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2022-10-31. Created: 2022-10-31. Last updated: 2024-07-02. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0001-7320-1506