Convex formulations for training two-layer ReLU neural networks
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
Department of Computing, Imperial College London, United Kingdom.
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. ORCID iD: 0000-0001-7320-1506
2025 (English). In: 13th International Conference on Learning Representations, ICLR 2025, Curran Associates, Inc., 2025, p. 30682-30704. Conference paper, Published paper (Refereed)
Abstract [en]

Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex formulations have been used for verifying neural network robustness, their application to training neural networks remains less explored. In response to this challenge, we reformulate the problem of training infinite-width two-layer ReLU networks as a convex completely positive program in a finite-dimensional (lifted) space. Despite the convexity, solving this problem remains NP-hard due to the complete positivity constraint. To overcome this challenge, we introduce a semidefinite relaxation that can be solved in polynomial time. We then experimentally evaluate the tightness of this relaxation, demonstrating its competitive performance in test accuracy across a range of classification tasks.
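As a rough illustration of why a semidefinite relaxation is natural here, the sketch below builds a completely positive matrix, i.e. a sum of outer products of entrywise nonnegative vectors, and checks two necessary conditions that are tractable to enforce: entrywise nonnegativity and positive semidefiniteness. The abstract does not state which relaxation the paper uses; keeping only these two conditions is one standard choice and is assumed here purely for illustration. The helper names (`outer_sum`, `quad_form`, `is_entrywise_nonneg`) are hypothetical, and the semidefiniteness check is a sampled sanity check, not a proof.

```python
import random


def outer_sum(factors):
    """Return X = sum_k b_k b_k^T for a list of equal-length vectors b_k."""
    n = len(factors[0])
    X = [[0.0] * n for _ in range(n)]
    for b in factors:
        for i in range(n):
            for j in range(n):
                X[i][j] += b[i] * b[j]
    return X


def quad_form(X, z):
    """Return the quadratic form z^T X z."""
    n = len(z)
    return sum(z[i] * X[i][j] * z[j] for i in range(n) for j in range(n))


def is_entrywise_nonneg(X, tol=1e-12):
    """Check that every entry of X is nonnegative up to a tolerance."""
    return all(v >= -tol for row in X for v in row)


rng = random.Random(0)

# Three nonnegative factor vectors in R^4 (illustrative sizes, not from the paper).
factors = [[rng.random() for _ in range(4)] for _ in range(3)]
X = outer_sum(factors)

# Necessary condition 1: nonnegative factors give nonnegative entries.
nonneg = is_entrywise_nonneg(X)

# Necessary condition 2: z^T X z = sum_k (b_k . z)^2 >= 0 for every z,
# spot-checked here on random directions.
min_q = min(
    quad_form(X, [rng.uniform(-1.0, 1.0) for _ in range(4)])
    for _ in range(1000)
)

print("entrywise nonnegative:", nonneg)
print("smallest sampled quadratic form is nonnegative:", min_q >= -1e-9)
```

Every completely positive matrix passes both checks, so a program that enforces only these conditions optimizes over a larger, tractable set. How closely the relaxed optimum matches the original one is what the paper evaluates experimentally.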

Place, publisher, year, edition, pages
Curran Associates, Inc., 2025. p. 30682-30704
Keywords [en]
copositive programming, semidefinite programming, neural networks
National Category
Artificial Intelligence
Identifiers
URN: urn:nbn:se:umu:diva-236599
Scopus ID: 2-s2.0-105010230817
ISBN: 979-8-3313-2085-0 (electronic)
OAI: oai:DiVA.org:umu-236599
DiVA, id: diva2:1945107
Conference
International Conference on Learning Representations (ICLR), Singapore, April 24-28, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Knut and Alice Wallenberg Foundation
Swedish Research Council
Available from: 2025-03-17 Created: 2025-03-17 Last updated: 2025-07-18 Bibliographically approved


Authority records

Prakhya, Karthik; Yurtsever, Alp
