Publications (10 of 13)
Myllykoski, M. (2022). Algorithm 1019: A Task-based Multi-shift QR/QZ Algorithm with Aggressive Early Deflation. ACM Transactions on Mathematical Software, 48(1), 1-36, Article ID 11.
2022 (English) In: ACM Transactions on Mathematical Software, ISSN 0098-3500, E-ISSN 1557-7295, Vol. 48, no. 1, p. 1-36, article id 11. Article in journal (Refereed). Published
Abstract [en]

The QR algorithm is one of the three phases in the process of computing the eigenvalues and the eigenvectors of a dense nonsymmetric matrix. This paper describes a task-based QR algorithm for reducing an upper Hessenberg matrix to real Schur form. The task-based algorithm also supports generalized eigenvalue problems (QZ algorithm) but this paper concentrates on the standard case. The task-based algorithm adopts previous algorithmic improvements, such as tightly-coupled multi-shifts and Aggressive Early Deflation (AED), and also incorporates several new ideas that significantly improve the performance. This includes, but is not limited to, the elimination of several synchronization points, the dynamic merging of previously separate computational steps, the shortening and the prioritization of the critical path, and experimental GPU support. The task-based implementation is demonstrated to be multiple times faster than multi-threaded LAPACK and ScaLAPACK in both single-node and multi-node configurations on two different machines based on Intel and AMD CPUs. The implementation is built on top of the StarPU runtime system and is part of the open-source StarNEig library.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
aggressive early deflation, MPI, multi-shift, QR algorithm, QZ algorithm, real Schur form, distributed memory, StarPU, shared memory, task-based, Eigenvalue problem, GPU
National Category
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-190558 (URN); 10.1145/3495005 (DOI); 000759468700012 (); 2-s2.0-85125191396 (Scopus ID)
Funder
EU, Horizon 2020, 671633; eSSENCE - An eScience Collaboration; Swedish Research Council, E0485301
Available from: 2021-12-18 Created: 2021-12-18 Last updated: 2023-09-05. Bibliographically approved
Bispo, J., Barbosa, J. G., Silva, P. F., Morales, C., Myllykoski, M., Ojeda-May, P., . . . Shoukourian, H. (2021). Best Practice Guide: Modern Accelerators.
2021 (English) Report (Other academic)
Publisher
p. 111
National Category
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-190729 (URN)
Available from: 2021-12-22 Created: 2021-12-22 Last updated: 2021-12-28. Bibliographically approved
Myllykoski, M. & Kjelgaard Mikkelsen, C. C. (2021). Task‐based, GPU‐accelerated and robust library for solving dense nonsymmetric eigenvalue problems. Concurrency and Computation, 33(11), Article ID e5915.
2021 (English) In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 33, no. 11, article id e5915. Article in journal (Refereed). Published
Abstract [en]

In this paper, we present the StarNEig library for solving dense nonsymmetric standard and generalized eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some components of the library have support for GPU acceleration. The library currently applies to real matrices with real and complex eigenvalues and all calculations are done using real arithmetic. Support for complex matrices is planned for a future release. This paper is aimed at potential users of the library. We describe the design choices and capabilities of the library, and contrast them to existing software such as LAPACK and ScaLAPACK. StarNEig implements a ScaLAPACK compatibility layer which should assist new users in the transition to StarNEig. We demonstrate the performance of the library with a sample of computational experiments.

Place, publisher, year, edition, pages
John Wiley & Sons, 2021
Keywords
eigenvalue problem, parallel computing, task‐based, numerical library
National Category
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-173924 (URN); 10.1002/cpe.5915 (DOI); 000555868500001 (); 2-s2.0-85089025359 (Scopus ID)
Projects
NLAFET
Available from: 2020-08-06 Created: 2020-08-06 Last updated: 2021-07-14. Bibliographically approved
Myllykoski, M. & Kjelgaard Mikkelsen, C. C. (2020). Introduction to StarNEig: A Task-based Library for Solving Nonsymmetric Eigenvalue Problems. In: Roman Wyrzykowski and Boleslaw Szymanski (Ed.), Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I. Paper presented at 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019 (pp. 70-81). Springer
2020 (English) In: Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I / [ed] Roman Wyrzykowski and Boleslaw Szymanski, Springer, 2020, p. 70-81. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we present the StarNEig library for solving dense nonsymmetric (generalized) eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some components of the library support GPUs. The library is currently in an early beta state and only real arithmetic is supported. Support for complex data types is planned for a future release. This paper is aimed at potential users of the library. We describe the design choices and capabilities of the library, and contrast them to existing software such as ScaLAPACK. StarNEig implements a ScaLAPACK compatibility layer that should make it easy for new users to transition to StarNEig. We demonstrate the performance of the library with a small set of computational experiments.

Place, publisher, year, edition, pages
Springer, 2020
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12043
Keywords
Eigenvalue problem, Task-based, Library
National Category
Research subject
Computer Science; Mathematics
Identifiers
urn:nbn:se:umu:diva-168419 (URN); 10.1007/978-3-030-43229-4_7 (DOI); 2-s2.0-85083964403 (Scopus ID); 978-3-030-43228-7 (ISBN); 978-3-030-43229-4 (ISBN)
Conference
13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019
Projects
NLAFET
Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2023-03-23. Bibliographically approved
Kjelgaard Mikkelsen, C. C. & Myllykoski, M. (2020). Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils. In: Roman Wyrzykowski, Ewa Deelman, Jack Dongarra, Konrad Karczewski (Ed.), Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I. Paper presented at 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019 (pp. 58-69). Springer
2020 (English) In: Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I / [ed] Roman Wyrzykowski, Ewa Deelman, Jack Dongarra, Konrad Karczewski, Springer, 2020, p. 58-69. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we consider the problem of computing generalized eigenvectors of a matrix pencil in real Schur form. In exact arithmetic, this problem can be solved using substitution. In practice, substitution is vulnerable to floating-point overflow. The robust solvers xtgevc in LAPACK prevent overflow by dynamically scaling the eigenvectors. These subroutines are scalar and sequential codes which compute the eigenvectors one by one. In this paper, we discuss how to derive robust algorithms which are blocked and parallel. The new StarNEig library contains a robust task-parallel solver Zazamoukh which runs on top of StarPU. Our numerical experiments show that Zazamoukh achieves a super-linear speedup compared with dtgevc for sufficiently large matrices.
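As an illustration of the overflow issue the abstract mentions, the following is a minimal sketch of back substitution with dynamic scaling for a plain upper triangular system. It is not the code of xtgevc, Zazamoukh, or StarNEig; the function name `robust_back_substitution`, the threshold `BIG`, and the specific rescaling rules are assumptions chosen for readability. The sketch also assumes finite inputs, nonzero diagonal entries that are not subnormal, and off-diagonal entries no larger than `BIG`. Instead of solving T x = b directly, it returns x together with a factor `scale` in (0, 1] such that T x = scale * b.

```python
import sys

# Overflow threshold with a safety margin. The factor 1/2 is an illustrative
# choice, not a value taken from LAPACK or StarNEig.
BIG = sys.float_info.max / 2


def robust_back_substitution(T, b):
    """Return (x, scale) such that T x = scale * b for upper triangular T.

    T is a list of rows, b a list of floats. The whole vector x is rescaled
    whenever a division or an update could otherwise overflow.
    """
    n = len(b)
    x = [float(v) for v in b]
    scale = 1.0

    def rescale(s):
        # Scale the entire solution vector and remember the factor.
        nonlocal scale
        for i in range(n):
            x[i] *= s
        scale *= s

    # Make sure the right-hand side itself is below the threshold.
    if max((abs(v) for v in x), default=0.0) > BIG:
        rescale(0.5)

    for j in range(n - 1, -1, -1):
        # Guard the division x[j] / T[j][j].
        tjj = abs(T[j][j])
        xj = abs(x[j])
        if tjj < 1.0 and xj > tjj * BIG:
            rescale(1.0 / xj)
        x[j] /= T[j][j]

        # Guard the updates x[i] -= T[i][j] * x[j] for i < j.
        xj = abs(x[j])
        xmax = max((abs(x[i]) for i in range(j)), default=0.0)
        cnorm = max((abs(T[i][j]) for i in range(j)), default=0.0)
        if xj > 1.0:
            if cnorm > (BIG - xmax) / xj:
                rescale(0.5 / xj)
        elif xj * cnorm > BIG - xmax:
            rescale(0.5)

        for i in range(j):
            x[i] -= T[i][j] * x[j]

    return x, scale
```

For example, with T = [[1e-200, 1.0], [0.0, 1e-200]] and b = [1.0, 1.0], the exact solution has a component of magnitude about 1e400, which is not representable in double precision. The sketch instead returns a finite x with a scale factor of roughly 1e-200. Rescaling the whole vector can push small components toward underflow, which is the usual trade-off of this approach.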

Place, publisher, year, edition, pages
Springer, 2020
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12043
Keywords
Generalized eigenvectors, overflow protection, task-parallelism
National Category
Research subject
Computer Science; Mathematics
Identifiers
urn:nbn:se:umu:diva-168416 (URN); 10.1007/978-3-030-43229-4_6 (DOI); 2-s2.0-85083956421 (Scopus ID); 978-3-030-43228-7 (ISBN); 978-3-030-43229-4 (ISBN)
Conference
13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019
Projects
NLAFET
Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2023-03-24. Bibliographically approved
Myllykoski, M., Kjelgaard Mikkelsen, C. C., Schwarz, A. B. & Kågström, B. (2019). D2.7 Eigenvalue solvers for nonsymmetric problems. NLAFET Consortium; Umeå University
2019 (English) Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2019. p. 29
National Category
Research subject
Mathematics; Computer Science
Identifiers
urn:nbn:se:umu:diva-168424 (URN)
Projects
NLAFET
Note

This work is © by the NLAFET Consortium, 2015–2019. Its duplication is allowed only for personal, educational, or research uses.

Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2023-03-07. Bibliographically approved
Karlsson, L., Eljammaly, M. & Myllykoski, M. (2019). D6.5 Evaluation of auto-tuning techniques. NLAFET Consortium; Umeå University
2019 (English) Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2019. p. 27
National Category
Research subject
Computer Science; Mathematics
Identifiers
urn:nbn:se:umu:diva-168425 (URN)
Projects
NLAFET
Note

This work is © by the NLAFET Consortium, 2015–2018. Its duplication is allowed only for personal, educational, or research uses.

Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2020-02-26. Bibliographically approved
Kågström, B., Myllykoski, M., Karlsson, L., Kjelgaard Mikkelsen, C. C., Cayrols, S., Duff, I., . . . Tissot, O. (2019). D7.8 Release of the NLAFET library. NLAFET Consortium; Umeå University
2019 (English) Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2019. p. 27
National Category
Research subject
Mathematics; Computer Science
Identifiers
urn:nbn:se:umu:diva-168426 (URN)
Projects
NLAFET
Note

This work is © by the NLAFET Consortium, 2015–2019. Its duplication is allowed only for personal, educational, or research uses.

Available from: 2020-02-25 Created: 2020-02-25 Last updated: 2020-02-27. Bibliographically approved
Myllykoski, M. (2018). A Task-Based Algorithm for Reordering the Eigenvalues of a Matrix in Real Schur Form. In: Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski (Ed.), Parallel Processing and Applied Mathematics: PPAM 2017. Paper presented at 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017) (pp. 207-216). Springer
2018 (English) In: Parallel Processing and Applied Mathematics: PPAM 2017 / [ed] Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski, Springer, 2018, p. 207-216. Conference paper, Published paper (Refereed)
Abstract [en]

A task-based parallel algorithm for reordering the eigenvalues of a matrix in real Schur form is presented. The algorithm is realized on top of the StarPU runtime system. Only the aspects which are relevant for shared memory machines are discussed here, but the implementation can be configured to run on distributed memory machines as well. Various techniques to reduce the overhead and the core idle time are discussed. Computational experiments indicate that the new algorithm is between 1.5 and 6.6 times faster than a state-of-the-art MPI-based implementation found in ScaLAPACK. With medium to large matrices, strong scaling efficiencies above 60% up to 28 CPU cores are reported. The overhead and the core idle time are shown to be negligible with the exception of the smallest matrices and highest core counts.

Place, publisher, year, edition, pages
Springer, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10777
Keywords
Eigenvalue reordering problem, Task based programming, Shared memory machines
National Category
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-145987 (URN); 10.1007/978-3-319-78024-5_19 (DOI); 000458563300019 (); 2-s2.0-85044751795 (Scopus ID); 978-3-319-78023-8 (ISBN); 978-3-319-78024-5 (ISBN)
Conference
12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017)
Available from: 2018-03-24 Created: 2018-03-24 Last updated: 2023-03-24. Bibliographically approved
Myllykoski, M., Karlsson, L., Kågström, B., Eljammaly, M., Pranesh, S. & Zounon, M. (2018). D2.6 Prototype Software for Eigenvalue Problem Solvers. NLAFET Consortium; Umeå University
2018 (English) Report (Other academic)
Place, publisher, year, edition, pages
NLAFET Consortium; Umeå University, 2018. p. 32
National Category
Research subject
Mathematics; Computer Science
Identifiers
urn:nbn:se:umu:diva-170222 (URN)
Projects
NLAFET
Note

Part of: Public Deliverables: WP2 – Dense Linear Systems and Eigenvalue Problem Solvers

Available from: 2020-04-29 Created: 2020-04-29 Last updated: 2020-05-05. Bibliographically approved
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-3689-0899