Umeå University's logo

umu.sePublications
Change search
Link to record
Permanent link

Direct link
Eftekhari, Armin
Publications (10 of 11) Show all publications
Maskan, H., Zygalakis, K. C., Eftekhari, A. & Yurtsever, A. (2025). A unified model for high-resolution ODEs: new insights on accelerated methods.
Open this publication in new window or tab >>A unified model for high-resolution ODEs: new insights on accelerated methods
2025 (English)Manuscript (preprint) (Other academic)
Abstract [en]

Recent work on high-resolution ordinary differential equations (HR-ODEs) captures fine nuances among different momentum-based optimization methods, leading to accurate theoretical insights. However, these HR-ODEs often appear disconnected, each targeting a specific algorithm and derived with different assumptions and techniques. We present a unifying framework by showing that these diverse HR-ODEs emerge as special cases of a general HR-ODE derived using the Forced Euler-Lagrange equation. Discretizing this model recovers a wide range of optimization algorithms through different parameter choices. Using integral quadratic constraints, we also introduce a general Lyapunov function to analyze the convergence of the proposed HR-ODE and its discretizations, achieving significant improvements across various cases, including new guarantees for the triple momentum method s HR-ODE and the quasi-hyperbolic momentum method, as well as faster gradient norm minimization rates for Nesterov s accelerated gradient algorithm, among other advances.

National Category
Computational Mathematics
Research subject
Mathematics
Identifiers
urn:nbn:se:umu:diva-248743 (URN)10.48550/arXiv.2503.15136 (DOI)
Available from: 2026-01-22 Created: 2026-01-22 Last updated: 2026-01-22Bibliographically approved
Goyens, F., Cartis, C. & Eftekhari, A. (2023). Nonlinear matrix recovery using optimization on the Grassmann manifold. Applied and Computational Harmonic Analysis, 62, 498-542
Open this publication in new window or tab >>Nonlinear matrix recovery using optimization on the Grassmann manifold
2023 (English)In: Applied and Computational Harmonic Analysis, ISSN 1063-5203, E-ISSN 1096-603X, Vol. 62, p. 498-542Article in journal (Refereed) Published
Abstract [en]

We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure such as a union of subspaces, an algebraic variety or grouped in clusters. The recovery problem is formulated as the rank minimization of a nonlinear feature map applied to the original matrix, which is then further approximated by a constrained non-convex optimization problem involving the Grassmann manifold. We propose two sets of algorithms, one arising from Riemannian optimization and the other as an alternating minimization scheme, both of which include first- and second-order variants. Both sets of algorithms have theoretical guarantees. In particular, for the alternating minimization, we establish global convergence and worst-case complexity bounds. Additionally, using the Kurdyka-Lojasiewicz property, we show that the alternating minimization converges to a unique limit point. We provide extensive numerical results for the recovery of union of subspaces and clustering under entry sampling and dense Gaussian sampling. Our methods are competitive with existing approaches and, in particular, high accuracy is achieved in the recovery using Riemannian second-order methods.

Place, publisher, year, edition, pages
Elsevier, 2023
Keywords
Nonlinear matrix recovery, Nonconvex optimization, Riemannian optimization, Second-order methods
National Category
Computational Mathematics Mathematical Analysis
Identifiers
urn:nbn:se:umu:diva-202159 (URN)10.1016/j.acha.2022.11.001 (DOI)2-s2.0-85142157294 (Scopus ID)
Note

Errata: Florentin Goyens, Coralia Cartis, Armin Eftekhari, Corrigendum to "Nonlinear matrix recovery using optimization on the Grassmann manifold" [Appl. Comput. Harmon. Anal. 62 (2023) 498–542], Applied and Computational Harmonic Analysis, Volume 63, 2023, Page 93, ISSN 1063-5203, https://doi.org/10.1016/j.acha.2022.12.004

Available from: 2023-01-03 Created: 2023-01-03 Last updated: 2023-01-03Bibliographically approved
Eftekhari, A., Vargas, L. & Zygalakis, K. C. (2023). The forward–backward envelope for sampling with the overdamped Langevin algorithm. Statistics and computing, 33(4), Article ID 85.
Open this publication in new window or tab >>The forward–backward envelope for sampling with the overdamped Langevin algorithm
2023 (English)In: Statistics and computing, ISSN 0960-3174, E-ISSN 1573-1375, Vol. 33, no 4, article id 85Article in journal (Refereed) Published
Abstract [en]

In this paper, we analyse a proximal method based on the idea of forward–backward splitting for sampling from distributions with densities that are not necessarily smooth. In particular, we study the non-asymptotic properties of the Euler–Maruyama discretization of the Langevin equation, where the forward–backward envelope is used to deal with the non-smooth part of the dynamics. An advantage of this envelope, when compared to widely-used Moreu–Yoshida one and the MYULA algorithm, is that it maintains the MAP estimator of the original non-smooth distribution. We also study a number of numerical experiments that support our theoretical findings.

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Convex optimization, Langevin equation, Markov chain Monte Carlo
National Category
Probability Theory and Statistics Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-209544 (URN)10.1007/s11222-023-10254-y (DOI)000999342700002 ()2-s2.0-85160908171 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)Knut and Alice Wallenberg Foundation
Available from: 2023-06-13 Created: 2023-06-13 Last updated: 2023-09-05Bibliographically approved
Eftekhari, A. (2022). Over-parametrized matrix factorization in the presence of spurious stationary points. IEEE Transactions on Signal Processing, 70, 482-496
Open this publication in new window or tab >>Over-parametrized matrix factorization in the presence of spurious stationary points
2022 (English)In: IEEE Transactions on Signal Processing, ISSN 1053-587X, E-ISSN 1941-0476, Vol. 70, p. 482-496Article in journal (Refereed) Published
Abstract [en]

Motivated by the emerging role of interpolating machines in signal processing and machine learning, this work considers the computational aspects of over-parametrized matrix factorization. In this context, the optimization landscape may contain spurious stationary points (SSPs), which are proved to be full-rank matrices. The presence of these SSPs means that it is impossible to hope for any global guarantees in over-parametrized matrix factorization. For example, when initialized at an SSP, the gradient flow will be trapped there forever. Nevertheless, despite these SSPs, we establish in this work that the gradient flow of the corresponding merit function converges to a global minimizer, provided that its initialization is rank-deficient and sufficiently close to the feasible set of the optimization problem. We numerically observe that a heuristic discretization of the proposed gradient flow, inspired by primal-dual algorithms, is successful when initialized randomly. Our result is in sharp contrast with the local refinement methods which require an initialization close to the optimal set of the optimization problem. More specifically, we successfully avoid the traps set by the SSPs because the gradient flow remains rank-deficient at all times, and not because there are no SSPs nearby. The latter is the case for the local refinement methods. Moreover, the widely-used restricted isometry property plays no role in our main result.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Keywords
Interpolation, Neural networks, Optimization, Signal processing, Signal processing algorithms, Toy manufacturing industry, Training
National Category
Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-203062 (URN)10.1109/TSP.2021.3139213 (DOI)000747441900001 ()2-s2.0-85122305088 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation, 10.13039/501100004063
Available from: 2023-01-16 Created: 2023-01-16 Last updated: 2023-01-16Bibliographically approved
Eftekhari, A., Vandereycken, B., Vilmart, G. & Zygalakis, K. C. (2021). Explicit stabilised gradient descent for faster strongly convex optimisation. BIT Numerical Mathematics, 61, 119-139
Open this publication in new window or tab >>Explicit stabilised gradient descent for faster strongly convex optimisation
2021 (English)In: BIT Numerical Mathematics, ISSN 0006-3835, E-ISSN 1572-9125, Vol. 61, p. 119-139Article in journal (Refereed) Published
Abstract [en]

This paper introduces the Runge-Kutta Chebyshev descent method (RKCD) for strongly convex optimisation problems. This new algorithm is based on explicit stabilised integrators for stiff differential equations, a powerful class of numerical schemes that avoid the severe step size restriction faced by standard explicit integrators. For optimising quadratic and strongly convex functions, this paper proves that RKCD nearly achieves the optimal convergence rate of the conjugate gradient algorithm, and the suboptimality of RKCD diminishes as the condition number of the quadratic function worsens. It is established that this optimal rate is obtained also for a partitioned variant of RKCD applied to perturbations of quadratic functions. In addition, numerical experiments on general strongly convex problems show that RKCD outperforms Nesterov's accelerated gradient descent.

Place, publisher, year, edition, pages
Springer, 2021
Keywords
Runge-Kutta methods, Strongly convex optimization, Accelerated gradient descent
National Category
Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-173636 (URN)10.1007/s10543-020-00819-y (DOI)000545282100001 ()2-s2.0-85087566688 (Scopus ID)
Available from: 2020-07-21 Created: 2020-07-21 Last updated: 2021-07-02Bibliographically approved
Vreugdenhil, R., Nguyen, V. A., Eftekhari, A. & Esfahani, P. M. (2021). Principal component hierarchy for sparse quadratic programs. In: Marina Meila; Tong Zhang (Ed.), Proceedings of machine learning research: . Paper presented at 38th International Conference on Machine Learning, ICML 2021, Online, July 18-24, 2021. (pp. 10607-10616). ML Research Press
Open this publication in new window or tab >>Principal component hierarchy for sparse quadratic programs
2021 (English)In: Proceedings of machine learning research / [ed] Marina Meila; Tong Zhang, ML Research Press , 2021, p. 10607-10616Conference paper, Published paper (Refereed)
Abstract [en]

We propose a novel approximation hierarchy for cardinality-constrained, convex quadratic programs that exploits the rank-dominating eigenvectors of the quadratic matrix. Each level of approximation admits a min-max characterization whose objective function can be optimized over the binary variables analytically, while preserving convexity in the continuous variables. Exploiting this property, we propose two scalable optimization algorithms, coined as the “best response” and the “dual program”, that can efficiently screen the potential indices of the nonzero elements of the original program. We show that the proposed methods are competitive with the existing screening methods in the current sparse regression literature, and it is particularly fast on instances with high number of measurements in experiments with both synthetic and real datasets.

Place, publisher, year, edition, pages
ML Research Press, 2021
Series
Proceedings of machine learning research, E-ISSN 2640-3498
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-210217 (URN)2-s2.0-85161314586 (Scopus ID)9781713845065 (ISBN)
Conference
38th International Conference on Machine Learning, ICML 2021, Online, July 18-24, 2021.
Available from: 2023-06-28 Created: 2023-06-28 Last updated: 2023-06-28Bibliographically approved
Eftekhari, A., Bendory, T. & Tang, G. (2021). Stable super-resolution of images: Theoretical study. Information and Inference, 10(1), 161-193
Open this publication in new window or tab >>Stable super-resolution of images: Theoretical study
2021 (English)In: Information and Inference, E-ISSN 2049-8772, Vol. 10, no 1, p. 161-193Article in journal (Refereed) Published
Abstract [en]

We study the ubiquitous super-resolution problem, in which one aims at localizing positive point sources in an image, blurred by the point spread function of the imaging device. To recover the point sources, we propose to solve a convex feasibility program, which simply finds a non-negative Borel measure that agrees with the observations collected by the imaging device. In the absence of imaging noise, we show that solving this convex program uniquely retrieves the point sources, provided that the imaging device collects enough observations. This result holds true if the point spread function of the imaging device can be decomposed into horizontal and vertical components and if the translations of these components form a Chebyshev system, i.e., a system of continuous functions that loosely behave like algebraic polynomials. Building upon the recent results for one-dimensional signals, we prove that this super-resolution algorithm is stable, in the generalized Wasserstein metric, to model mismatch (i.e., when the image is not sparse) and to additive imaging noise. In particular, the recovery error depends on the noise level and how well the image can be approximated with well-separated point sources. As an example, we verify these claims for the important case of a Gaussian point spread function. The proofs rely on the construction of novel interpolating polynomials - which are the main technical contribution of this paper - and partially resolve the question raised in Schiebinger et al. (2017, Inf. Inference, 7, 1-30) about the extension of the standard machinery to higher dimensions.

Place, publisher, year, edition, pages
Oxford University Press, 2021
National Category
Computational Mathematics Computer graphics and computer vision
Identifiers
urn:nbn:se:umu:diva-182946 (URN)10.1093/imaiai/iaaa029 (DOI)000637282400005 ()2-s2.0-85104849690 (Scopus ID)
Available from: 2021-05-11 Created: 2021-05-11 Last updated: 2025-02-01Bibliographically approved
Song, C., Ramezani-Kebrya, A., Pethick, T., Eftekhari, A. & Cevher, V. (2021). Subquadratic overparameterization for shallow neural networks. In: M. Ranzato; A. Beygelzimer; Y. Dauphin; P.S. Liang; J. Wortman Vaughan (Ed.), Advances in neural information processing systems (NeurIPS 2021): . Paper presented at 35th Conference on Neural Information Processing Systems, NeurIPS 2021, Virtual (Online), 6 December 6-14, 2021. (pp. 11247-11259). Curran Associates, Inc., 34
Open this publication in new window or tab >>Subquadratic overparameterization for shallow neural networks
Show others...
2021 (English)In: Advances in neural information processing systems (NeurIPS 2021) / [ed] M. Ranzato; A. Beygelzimer; Y. Dauphin; P.S. Liang; J. Wortman Vaughan, Curran Associates, Inc., 2021, Vol. 34, p. 11247-11259Conference paper, Published paper (Refereed)
Abstract [en]

Overparameterization refers to the important phenomenon where the width of a neural network is chosen such that learning algorithms can provably attain zero loss in nonconvex training. The existing theory establishes such global convergence using various initialization strategies, training modifications, and width scalings. In particular, the state-of-the-art results require the width to scale quadratically with the number of training data under standard initialization strategies used in practice for best generalization performance. In contrast, the most recent results obtain linear scaling either with requiring initializations that lead to the “lazy-training”, or training only a single layer. In this work, we provide an analytical framework that allows us to adopt standard initialization strategies, possibly avoid lazy training, and train all layers simultaneously in basic shallow neural networks while attaining a desirable subquadratic scaling on the network width. We achieve the desiderata via Polyak-Łojasiewicz condition, smoothness, and standard assumptions on data, and use tools from random matrix theory.

Place, publisher, year, edition, pages
Curran Associates, Inc., 2021
Series
Conference on Neural Information Processing Systems, ISSN 1049-5258
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-196833 (URN)2-s2.0-85131797261 (Scopus ID)9781713845393 (ISBN)
Conference
35th Conference on Neural Information Processing Systems, NeurIPS 2021, Virtual (Online), 6 December 6-14, 2021.
Funder
EU, Horizon 2020, 725594EU, European Research Council
Available from: 2022-06-21 Created: 2022-06-21 Last updated: 2022-06-21Bibliographically approved
Rolland, P., Eftekhari, A., Kavis, A. & Cevher, V. (2020). Double-loop unadjusted langevin algorithm. In: Hal Daumé III, Aarti Singh (Ed.), 37th International Conference on Machine Learning, ICML 2020: . Paper presented at 37th International Conference on Machine Learning, ICML 2020, Online, 13-18 July, 2020. (pp. 8139-8147). International Machine Learning Society (IMLS)
Open this publication in new window or tab >>Double-loop unadjusted langevin algorithm
2020 (English)In: 37th International Conference on Machine Learning, ICML 2020 / [ed] Hal Daumé III, Aarti Singh, International Machine Learning Society (IMLS) , 2020, p. 8139-8147Conference paper, Published paper (Refereed)
Abstract [en]

A well-known first-order method for sampling from log-concave probability distributions is the Unadjusted Langevin Algorithm (ULA). This work proposes a new annealing step-size schedule for ULA, which allows to prove new convergence guarantees for sampling from a smooth log-concave distribution, which are not covered by existing state-of-the-art convergence guarantees. To establish this result, we derive a new theoretical bound that relates the Wasserstein distance to total variation distance between any two log-concave distributions that complements the reach of Talagrand T2 inequality. Moreover, applying this new step size schedule to an existing constrained sampling algorithm, we show stateof- the-art convergence rates for sampling from a constrained log-concave distribution, as well as improved dimension dependence.

Place, publisher, year, edition, pages
International Machine Learning Society (IMLS), 2020
National Category
Computational Mathematics Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-183574 (URN)2-s2.0-85105307677 (Scopus ID)9781713821120 (ISBN)
Conference
37th International Conference on Machine Learning, ICML 2020, Online, 13-18 July, 2020.
Available from: 2021-06-02 Created: 2021-06-02 Last updated: 2021-06-02Bibliographically approved
Sanchez, T., Gözcü, B., van Heeswijk, R. B., Eftekhari, A., Ilicak, E., Çukur, T. & Cevher, V. (2020). Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRI. In: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): . Paper presented at IEEE International Conference on Acoustics, Speech, and Signal Processing, MAY 04-08, 2020, Barcelona, SPAIN (pp. 8584-8588). IEEE
Open this publication in new window or tab >>Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRI
Show others...
2020 (English)In: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2020, p. 8584-8588Conference paper, Published paper (Refereed)
Abstract [en]

Compressed sensing applied to magnetic resonance imaging (MRI) allows to reduce the scanning time by enabling images to be reconstructed from highly undersampled data. In this paper, we tackle the problem of designing a sampling mask for an arbitrary reconstruction method and a limited acquisition budget. Namely, we look for an optimal probability distribution from which a mask with a fixed cardinality is drawn. We demonstrate that this problem admits a compactly supported solution, which leads to a deterministic optimal sampling mask. We then propose a stochastic greedy algorithm that (i) provides an approximate solution to this problem, and (ii) resolves the scaling issues of [1, 2]. We validate its performance on in vivo dynamic MRI with retrospective undersampling, showing that our method preserves the performance of [1, 2] while reducing the computational burden by a factor close to 200. Our implementation is available at https://github.com/t-sanchez/stochasticGreedyMRI.

Place, publisher, year, edition, pages
IEEE, 2020
Series
International Conference on Acoustics Speech and Signal Processing ICASSP, ISSN 1520-6149
Keywords
Magnetic resonance imaging, compressive sensing (CS), learning-based sampling
National Category
Radiology, Nuclear Medicine and Medical Imaging Medical Imaging
Identifiers
urn:nbn:se:umu:diva-187150 (URN)10.1109/ICASSP40776.2020.9053345 (DOI)000615970408171 ()2-s2.0-85089239455 (Scopus ID)978-1-5090-6631-5 (ISBN)978-1-5090-6632-2 (ISBN)
Conference
IEEE International Conference on Acoustics, Speech, and Signal Processing, MAY 04-08, 2020, Barcelona, SPAIN
Funder
EU, Horizon 2020, 725594EU, European Research Council
Available from: 2021-09-13 Created: 2021-09-13 Last updated: 2025-02-09Bibliographically approved
Organisations

Search in DiVA

Show all publications