Umeå University's logo

umu.sePublications
Change search
Link to record
Permanent link

Direct link
Publications (10 of 13) Show all publications
Myllykoski, M. (2022). Algorithm 1019: A Task-based Multi-shift QR/QZ Algorithm with Aggressive Early Deflation. ACM Transactions on Mathematical Software, 48(1), 1-36, Article ID 11.
Open this publication in new window or tab >>Algorithm 1019: A Task-based Multi-shift QR/QZ Algorithm with Aggressive Early Deflation
2022 (English)In: ACM Transactions on Mathematical Software, ISSN 0098-3500, E-ISSN 1557-7295, Vol. 48, no 1, p. 1-36, article id 11Article in journal (Refereed) Published
Abstract [en]

The QR algorithm is one of the three phases in the process of computing the eigenvalues and the eigenvectors of a dense nonsymmetric matrix. This paper describes a task-based QR algorithm for reducing an upper Hessenberg matrix to real Schur form. The task-based algorithm also supports generalized eigenvalue problems (QZ algorithm) but this paper concentrates on the standard case. The task-based algorithm adopts previous algorithmic improvements, such as tightly-coupled multi-shifts and Aggressive Early Deflation (AED), and also incorporates several new ideas that significantly improve the performance. This includes, but is not limited to, the elimination of several synchronization points, the dynamic merging of previously separate computational steps, the shortening and the prioritization of the critical path, and experimental GPU support. The task-based implementation is demonstrated to be multiple times faster than multi-threaded LAPACK and ScaLAPACK in both single-node and multi-node configurations on two different machines based on Intel and AMD CPUs. The implementation is built on top of the StarPU runtime system and is part of the open-source StarNEig library.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
aggressive early deflation, MPI, multi-shift, QR algorithm, QZ algorithm, real Schur form, distributed memory, StarPU, shared memory, task-based, Eigenvalue problem, GPU
National Category
Computer Sciences Computational Mathematics
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-190558 (URN)10.1145/3495005 (DOI)000759468700012 ()2-s2.0-85125191396 (Scopus ID)
Funder
EU, Horizon 2020, 671633eSSENCE - An eScience CollaborationSwedish Research Council, E0485301
Available from: 2021-12-18 Created: 2021-12-18 Last updated: 2023-09-05Bibliographically approved
Bispo, J., Barbosa, J. G., Silva, P. F., Morales, C., Myllykoski, M., Ojeda-May, P., . . . Shoukourian, H. (2021). Best Practice Guide: Modern Accelerators.
Open this publication in new window or tab >>Best Practice Guide: Modern Accelerators
Show others...
2021 (English)Report (Other academic)
Publisher
p. 111
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-190729 (URN)
Available from: 2021-12-22 Created: 2021-12-22 Last updated: 2021-12-28Bibliographically approved
Myllykoski, M. & Kjelgaard Mikkelsen, C. C. (2021). Task‐based, GPU‐accelerated and robust library for solving dense nonsymmetric eigenvalue problems. Concurrency and Computation, 33(11), Article ID e5915.
Open this publication in new window or tab >>Task‐based, GPU‐accelerated and robust library for solving dense nonsymmetric eigenvalue problems
2021 (English)In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 33, no 11, article id e5915Article in journal (Refereed) Published
Abstract [en]

In this paper, we present the StarNEig library for solving dense nonsymmetric standard and generalized eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some components of the library have support for GPU acceleration. The library currently applies to real matrices with real and complex eigenvalues and all calculations are done using real arithmetic. Support for complex matrices is planned for a future release. This paper is aimed at potential users of the library. We describe the design choices and capabilities of the library, and contrast them to existing software such as LAPACK and ScaLAPACK. StarNEig implements a ScaLAPACK compatibility layer which should assist new users in the transition to StarNEig. We demonstrate the performance of the library with a sample of computational experiments.

Place, publisher, year, edition, pages
John Wiley & Sons, 2021
Keywords
eigenvalue problem, parallel computing, task‐based, numerical library
National Category
Computer Sciences Computational Mathematics
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-173924 (URN)10.1002/cpe.5915 (DOI)000555868500001 ()2-s2.0-85089025359 (Scopus ID)
Projects
NLAFET
Available from: 2020-08-06 Created: 2020-08-06 Last updated: 2021-07-14Bibliographically approved
Myllykoski, M. & Kjelgaard Mikkelsen, C. C. (2020). Introduction to StarNEig: A Task-based Library for Solving Nonsymmetric Eigenvalue Problems. In: Roman Wyrzykowski and Boleslaw Szymanski (Ed.), Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I. Paper presented at 13th International Conference on Parallel Computing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019 (pp. 70-81). Springer
Open this publication in new window or tab >>Introduction to StarNEig: A Task-based Library for Solving Nonsymmetric Eigenvalue Problems
2020 (English)In: Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I / [ed] Roman Wyrzykowski and Boleslaw Szymanski, Springer, 2020, p. 70-81Conference paper, Published paper (Refereed)
Abstract [en]

Abstract. In this paper, we present the StarNEig library for solvingdense nonsymmetric (generalized) eigenvalue problems. The library isbuilt on top of the StarPU runtime system and targets both shared anddistributed memory machines. Some components of the library supportGPUs. The library is currently in an early beta state and only real arith-metic is supported. Support for complex data types is planned for afuture release. This paper is aimed at potential users of the library. Wedescribe the design choices and capabilities of the library, and contrastthem to existing software such as ScaLAPACK. StarNEig implements aScaLAPACK compatibility layer that should make it easy for new usersto transition to StarNEig. We demonstrate the performance of the librarywith a small set of computational experiments.

Place, publisher, year, edition, pages
Springer, 2020
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12043
Keywords
Eigenvalue problem, Task-based, Library
National Category
Computer Sciences
Research subject
Computer Science; Mathematics
Identifiers
urn:nbn:se:umu:diva-168419 (URN)10.1007/978-3-030-43229-4_7 (DOI)2-s2.0-85083964403 (Scopus ID)978-3-030-43228-7 (ISBN)978-3-030-43229-4 (ISBN)
Conference
13th International Conference on Parallel Computing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019
Projects
NLAFET
Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2023-03-23Bibliographically approved
Kjelgaard Mikkelsen, C. C. & Myllykoski, M. (2020). Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils. In: Roman Wyrzykowski, Ewa Deelman, Jack Dongarra, Konrad Karczewski (Ed.), Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I. Paper presented at 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019 (pp. 58-69). Springer
Open this publication in new window or tab >>Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils
2020 (English)In: Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I / [ed] Roman Wyrzykowski, Ewa Deelman, Jack Dongarra, Konrad Karczewski, Springer, 2020, p. 58-69Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we consider the problem of computing generalized eigenvectors of a matrix pencil in real Schur form. In exact arithmetic, this problem can be solved using substitution. In practice, substitution is vulnerable to floating-point overflow. The robust solvers xtgevc in LAPACK prevent overflow by dynamically scaling the eigenvectors.These subroutines are scalar and sequential codes which compute theeigenvectors one by one. In this paper, we discuss how to derive robust algorithms which are blocked and parallel. The new StarNEig librarycontains a robust task-parallel solver Zazamoukh which runs on top of StarPU. Our numerical experiments show that Zazamoukh achieves a super-linear speedup compared with dtgevc for sufficiently large matrices.

Place, publisher, year, edition, pages
Springer, 2020
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12043
Keywords
Generalized eigenvectors, overflow protection, task-parallelism
National Category
Computer Sciences
Research subject
Computer Science; Mathematics
Identifiers
urn:nbn:se:umu:diva-168416 (URN)10.1007/978-3-030-43229-4_6 (DOI)2-s2.0-85083956421 (Scopus ID)978-3-030-43228-7 (ISBN)978-3-030-43229-4 (ISBN)
Conference
13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019
Projects
NLAFET
Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2023-03-24Bibliographically approved
Myllykoski, M., Kjelgaard Mikkelsen, C. C., Schwarz, A. B. & Kågström, B. (2019). D2.7 Eigenvalue solvers for nonsymmetric problems. NLAFET Consortium; Umeå University
Open this publication in new window or tab >>D2.7 Eigenvalue solvers for nonsymmetric problems
2019 (English)Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2019. p. 29
National Category
Computer Sciences
Research subject
Mathematics; Computer Science
Identifiers
urn:nbn:se:umu:diva-168424 (URN)
Projects
NLAFET
Note

This work is c by the NLAFET Consortium, 2015–2019. Its duplication is allowed only for personal, educational, or research uses.

Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2023-03-07Bibliographically approved
Karlsson, L., Eljammaly, M. & Myllykoski, M. (2019). D6.5 Evaluation of auto-tuning techniques. NLAFET Consortium; Umeå University
Open this publication in new window or tab >>D6.5 Evaluation of auto-tuning techniques
2019 (English)Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2019. p. 27
National Category
Computer Sciences
Research subject
Computer Science; Mathematics
Identifiers
urn:nbn:se:umu:diva-168425 (URN)
Projects
NLAFET
Note

This work is c by the NLAFET Consortium, 2015–2018. Its duplication is allowed only for personal, educational, or research uses.

Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2020-02-26Bibliographically approved
Kågström, B., Myllykoski, M., Karlsson, L., Kjelgaard Mikkelsen, C. C., Cayrols, S., Duff, I., . . . Tissot, O. (2019). D7.8 Release of the NLAFET library. NLAFET Consortium; Umeå University
Open this publication in new window or tab >>D7.8 Release of the NLAFET library
Show others...
2019 (English)Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2019. p. 27
National Category
Computer Sciences
Research subject
Mathematics; Computer Science
Identifiers
urn:nbn:se:umu:diva-168426 (URN)
Projects
NLAFET
Note

This work is c by the NLAFET Consortium, 2015–2019. Its duplication is allowed only for personal, educational, or research uses.

Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2020-02-27Bibliographically approved
Myllykoski, M. (2018). A Task-Based Algorithm for Reordering the Eigenvalues of a Matrix in Real Schur Form. In: Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski (Ed.), Parallel Processing and Applied Mathematics: PPAM 2017. Paper presented at 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017) (pp. 207-216). Springer
Open this publication in new window or tab >>A Task-Based Algorithm for Reordering the Eigenvalues of a Matrix in Real Schur Form
2018 (English)In: Parallel Processing and Applied Mathematics: PPAM 2017 / [ed] Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski, Springer, 2018, p. 207-216Conference paper, Published paper (Refereed)
Abstract [en]

A task-based parallel algorithm for reordering the eigenvalues of a matrix in real Schur form is presented.The algorithm is realized on top of the StarPU runtime system.Only the aspects which are relevant for shared memory machines are discussed here, but the implementation can be configured to run on distributed memory machines as well.Various techniques to reduce the overhead and the core idle time are discussed.Computational experiments indicate that the new algorithm is between 1.5 and 6.6 times faster than a state of the art MPI-based implementation found in ScaLAPACK.With medium to large matrices, strong scaling efficiencies above 60\% up to 28 CPU cores are reported.The overhead and the core idle time are shown to be negligible with the exception of the smallest matrices and highest core counts.

Place, publisher, year, edition, pages
Springer, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10777
Keywords
Eigenvalue reordering problem, Task based programming, Shared memory machines
National Category
Computer Sciences Computational Mathematics
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-145987 (URN)10.1007/978-3-319-78024-5_19 (DOI)000458563300019 ()2-s2.0-85044751795 (Scopus ID)978-3-319-78023-8 (ISBN)978-3-319-78024-5 (ISBN)
Conference
12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017)
Available from: 2018-03-24 Created: 2018-03-24 Last updated: 2023-03-24Bibliographically approved
Myllykoski, M., Karlsson, L., Kågström, B., Eljammaly, M., Pranesh, S. & Zounon, M. (2018). D2.6 Prototype Software for Eigenvalue Problem Solvers. NLAFET Consortium; Umeå University
Open this publication in new window or tab >>D2.6 Prototype Software for Eigenvalue Problem Solvers
Show others...
2018 (English)Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2018. p. 32
National Category
Computer Sciences
Research subject
Mathematics; Computer Science
Identifiers
urn:nbn:se:umu:diva-170222 (URN)
Projects
NLAFET
Note

Part of: Public Deliverables: WP2 – Dense Linear Systems and Eigenvalue Problem Solvers

Available from: 2020-04-29 Created: 2020-04-29 Last updated: 2020-05-05Bibliographically approved
Organisations
Identifiers
ORCID iD: ORCID iD iconorcid.org/0000-0002-3689-0899

Search in DiVA

Show all publications