Umeå universitets logga

umu.sePublikationer
Ändra sökning
Länk till posten
Permanent länk

Direktlänk
Publikationer (10 of 13) Visa alla publikationer
Myllykoski, M. (2022). Algorithm 1019: A Task-based Multi-shift QR/QZ Algorithm with Aggressive Early Deflation. ACM Transactions on Mathematical Software, 48(1), 1-36, Article ID 11.
Öppna denna publikation i ny flik eller fönster >>Algorithm 1019: A Task-based Multi-shift QR/QZ Algorithm with Aggressive Early Deflation
2022 (Engelska)Ingår i: ACM Transactions on Mathematical Software, ISSN 0098-3500, E-ISSN 1557-7295, Vol. 48, nr 1, s. 1-36, artikel-id 11Artikel i tidskrift (Refereegranskat) Published
Abstract [en]

The QR algorithm is one of the three phases in the process of computing the eigenvalues and the eigenvectors of a dense nonsymmetric matrix. This paper describes a task-based QR algorithm for reducing an upper Hessenberg matrix to real Schur form. The task-based algorithm also supports generalized eigenvalue problems (QZ algorithm) but this paper concentrates on the standard case. The task-based algorithm adopts previous algorithmic improvements, such as tightly-coupled multi-shifts and Aggressive Early Deflation (AED), and also incorporates several new ideas that significantly improve the performance. This includes, but is not limited to, the elimination of several synchronization points, the dynamic merging of previously separate computational steps, the shortening and the prioritization of the critical path, and experimental GPU support. The task-based implementation is demonstrated to be multiple times faster than multi-threaded LAPACK and ScaLAPACK in both single-node and multi-node configurations on two different machines based on Intel and AMD CPUs. The implementation is built on top of the StarPU runtime system and is part of the open-source StarNEig library.

Ort, förlag, år, upplaga, sidor
Association for Computing Machinery (ACM), 2022
Nyckelord
aggressive early deflation, MPI, multi-shift, QR algorithm, QZ algorithm, real Schur form, distributed memory, StarPU, shared memory, task-based, Eigenvalue problem, GPU
Nationell ämneskategori
Datavetenskap (datalogi) Beräkningsmatematik
Forskningsämne
datalogi
Identifikatorer
urn:nbn:se:umu:diva-190558 (URN)10.1145/3495005 (DOI)000759468700012 ()2-s2.0-85125191396 (Scopus ID)
Forskningsfinansiär
EU, Horisont 2020, 671633eSSENCE - An eScience CollaborationVetenskapsrådet, E0485301
Tillgänglig från: 2021-12-18 Skapad: 2021-12-18 Senast uppdaterad: 2023-09-05Bibliografiskt granskad
Bispo, J., Barbosa, J. G., Silva, P. F., Morales, C., Myllykoski, M., Ojeda-May, P., . . . Shoukourian, H. (2021). Best Practice Guide: Modern Accelerators.
Öppna denna publikation i ny flik eller fönster >>Best Practice Guide: Modern Accelerators
Visa övriga...
2021 (Engelska)Rapport (Övrigt vetenskapligt)
Förlag
s. 111
Nationell ämneskategori
Datavetenskap (datalogi)
Forskningsämne
datalogi
Identifikatorer
urn:nbn:se:umu:diva-190729 (URN)
Tillgänglig från: 2021-12-22 Skapad: 2021-12-22 Senast uppdaterad: 2021-12-28Bibliografiskt granskad
Myllykoski, M. & Kjelgaard Mikkelsen, C. C. (2021). Task‐based, GPU‐accelerated and robust library for solving dense nonsymmetric eigenvalue problems. Concurrency and Computation, 33(11), Article ID e5915.
Öppna denna publikation i ny flik eller fönster >>Task‐based, GPU‐accelerated and robust library for solving dense nonsymmetric eigenvalue problems
2021 (Engelska)Ingår i: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 33, nr 11, artikel-id e5915Artikel i tidskrift (Refereegranskat) Published
Abstract [en]

In this paper, we present the StarNEig library for solving dense nonsymmetric standard and generalized eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some components of the library have support for GPU acceleration. The library currently applies to real matrices with real and complex eigenvalues and all calculations are done using real arithmetic. Support for complex matrices is planned for a future release. This paper is aimed at potential users of the library. We describe the design choices and capabilities of the library, and contrast them to existing software such as LAPACK and ScaLAPACK. StarNEig implements a ScaLAPACK compatibility layer which should assist new users in the transition to StarNEig. We demonstrate the performance of the library with a sample of computational experiments.

Ort, förlag, år, upplaga, sidor
John Wiley & Sons, 2021
Nyckelord
eigenvalue problem, parallel computing, task‐based, numerical library
Nationell ämneskategori
Datavetenskap (datalogi) Beräkningsmatematik
Forskningsämne
datalogi
Identifikatorer
urn:nbn:se:umu:diva-173924 (URN)10.1002/cpe.5915 (DOI)000555868500001 ()2-s2.0-85089025359 (Scopus ID)
Projekt
NLAFET
Tillgänglig från: 2020-08-06 Skapad: 2020-08-06 Senast uppdaterad: 2021-07-14Bibliografiskt granskad
Myllykoski, M. & Kjelgaard Mikkelsen, C. C. (2020). Introduction to StarNEig: A Task-based Library for Solving Nonsymmetric Eigenvalue Problems. In: Roman Wyrzykowski and Boleslaw Szymanski (Ed.), Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I. Paper presented at 13th International Conference on Parallel Computing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019 (pp. 70-81). Springer
Öppna denna publikation i ny flik eller fönster >>Introduction to StarNEig: A Task-based Library for Solving Nonsymmetric Eigenvalue Problems
2020 (Engelska)Ingår i: Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I / [ed] Roman Wyrzykowski and Boleslaw Szymanski, Springer, 2020, s. 70-81Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

Abstract. In this paper, we present the StarNEig library for solvingdense nonsymmetric (generalized) eigenvalue problems. The library isbuilt on top of the StarPU runtime system and targets both shared anddistributed memory machines. Some components of the library supportGPUs. The library is currently in an early beta state and only real arith-metic is supported. Support for complex data types is planned for afuture release. This paper is aimed at potential users of the library. Wedescribe the design choices and capabilities of the library, and contrastthem to existing software such as ScaLAPACK. StarNEig implements aScaLAPACK compatibility layer that should make it easy for new usersto transition to StarNEig. We demonstrate the performance of the librarywith a small set of computational experiments.

Ort, förlag, år, upplaga, sidor
Springer, 2020
Serie
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12043
Nyckelord
Eigenvalue problem, Task-based, Library
Nationell ämneskategori
Datavetenskap (datalogi)
Forskningsämne
datalogi; matematik
Identifikatorer
urn:nbn:se:umu:diva-168419 (URN)10.1007/978-3-030-43229-4_7 (DOI)2-s2.0-85083964403 (Scopus ID)978-3-030-43228-7 (ISBN)978-3-030-43229-4 (ISBN)
Konferens
13th International Conference on Parallel Computing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019
Projekt
NLAFET
Tillgänglig från: 2020-02-25 Skapad: 2020-02-25 Senast uppdaterad: 2023-03-23Bibliografiskt granskad
Kjelgaard Mikkelsen, C. C. & Myllykoski, M. (2020). Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils. In: Roman Wyrzykowski, Ewa Deelman, Jack Dongarra, Konrad Karczewski (Ed.), Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I. Paper presented at 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019 (pp. 58-69). Springer
Öppna denna publikation i ny flik eller fönster >>Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils
2020 (Engelska)Ingår i: Parallel Processing and Applied Mathematics: Revised Selected Papers, Part I / [ed] Roman Wyrzykowski, Ewa Deelman, Jack Dongarra, Konrad Karczewski, Springer, 2020, s. 58-69Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

In this paper we consider the problem of computing generalized eigenvectors of a matrix pencil in real Schur form. In exact arithmetic, this problem can be solved using substitution. In practice, substitution is vulnerable to floating-point overflow. The robust solvers xtgevc in LAPACK prevent overflow by dynamically scaling the eigenvectors.These subroutines are scalar and sequential codes which compute theeigenvectors one by one. In this paper, we discuss how to derive robust algorithms which are blocked and parallel. The new StarNEig librarycontains a robust task-parallel solver Zazamoukh which runs on top of StarPU. Our numerical experiments show that Zazamoukh achieves a super-linear speedup compared with dtgevc for sufficiently large matrices.

Ort, förlag, år, upplaga, sidor
Springer, 2020
Serie
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12043
Nyckelord
Generalized eigenvectors, overflow protection, task-parallelism
Nationell ämneskategori
Datavetenskap (datalogi)
Forskningsämne
datalogi; matematik
Identifikatorer
urn:nbn:se:umu:diva-168416 (URN)10.1007/978-3-030-43229-4_6 (DOI)2-s2.0-85083956421 (Scopus ID)978-3-030-43228-7 (ISBN)978-3-030-43229-4 (ISBN)
Konferens
13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Bialystok, Poland, September 8-11, 2019
Projekt
NLAFET
Tillgänglig från: 2020-02-25 Skapad: 2020-02-25 Senast uppdaterad: 2023-03-24Bibliografiskt granskad
Myllykoski, M., Kjelgaard Mikkelsen, C. C., Schwarz, A. B. & Kågström, B. (2019). D2.7 Eigenvalue solvers for nonsymmetric problems. NLAFET Consortium; Umeå University
Öppna denna publikation i ny flik eller fönster >>D2.7 Eigenvalue solvers for nonsymmetric problems
2019 (Engelska)Rapport (Övrigt vetenskapligt)
Ort, förlag, år, upplaga, sidor
NLAFET Consortium; Umeå University, 2019. s. 29
Nationell ämneskategori
Datavetenskap (datalogi)
Forskningsämne
matematik; datalogi
Identifikatorer
urn:nbn:se:umu:diva-168424 (URN)
Projekt
NLAFET
Anmärkning

This work is c by the NLAFET Consortium, 2015–2019. Its duplication is allowed only for personal, educational, or research uses.

Tillgänglig från: 2020-02-25 Skapad: 2020-02-25 Senast uppdaterad: 2023-03-07Bibliografiskt granskad
Karlsson, L., Eljammaly, M. & Myllykoski, M. (2019). D6.5 Evaluation of auto-tuning techniques. NLAFET Consortium; Umeå University
Öppna denna publikation i ny flik eller fönster >>D6.5 Evaluation of auto-tuning techniques
2019 (Engelska)Rapport (Övrigt vetenskapligt)
Ort, förlag, år, upplaga, sidor
NLAFET Consortium; Umeå University, 2019. s. 27
Nationell ämneskategori
Datavetenskap (datalogi)
Forskningsämne
datalogi; matematik
Identifikatorer
urn:nbn:se:umu:diva-168425 (URN)
Projekt
NLAFET
Anmärkning

This work is c by the NLAFET Consortium, 2015–2018. Its duplication is allowed only for personal, educational, or research uses.

Tillgänglig från: 2020-02-25 Skapad: 2020-02-25 Senast uppdaterad: 2020-02-26Bibliografiskt granskad
Kågström, B., Myllykoski, M., Karlsson, L., Kjelgaard Mikkelsen, C. C., Cayrols, S., Duff, I., . . . Tissot, O. (2019). D7.8 Release of the NLAFET library. NLAFET Consortium; Umeå University
Öppna denna publikation i ny flik eller fönster >>D7.8 Release of the NLAFET library
Visa övriga...
2019 (Engelska)Rapport (Övrigt vetenskapligt)
Ort, förlag, år, upplaga, sidor
NLAFET Consortium; Umeå University, 2019. s. 27
Nationell ämneskategori
Datavetenskap (datalogi)
Forskningsämne
matematik; datalogi
Identifikatorer
urn:nbn:se:umu:diva-168426 (URN)
Projekt
NLAFET
Anmärkning

This work is c by the NLAFET Consortium, 2015–2019. Its duplication is allowed only for personal, educational, or research uses.

Tillgänglig från: 2020-02-25 Skapad: 2020-02-25 Senast uppdaterad: 2020-02-27Bibliografiskt granskad
Myllykoski, M. (2018). A Task-Based Algorithm for Reordering the Eigenvalues of a Matrix in Real Schur Form. In: Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski (Ed.), Parallel Processing and Applied Mathematics: PPAM 2017. Paper presented at 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017) (pp. 207-216). Springer
Öppna denna publikation i ny flik eller fönster >>A Task-Based Algorithm for Reordering the Eigenvalues of a Matrix in Real Schur Form
2018 (Engelska)Ingår i: Parallel Processing and Applied Mathematics: PPAM 2017 / [ed] Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski, Springer, 2018, s. 207-216Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

A task-based parallel algorithm for reordering the eigenvalues of a matrix in real Schur form is presented.The algorithm is realized on top of the StarPU runtime system.Only the aspects which are relevant for shared memory machines are discussed here, but the implementation can be configured to run on distributed memory machines as well.Various techniques to reduce the overhead and the core idle time are discussed.Computational experiments indicate that the new algorithm is between 1.5 and 6.6 times faster than a state of the art MPI-based implementation found in ScaLAPACK.With medium to large matrices, strong scaling efficiencies above 60\% up to 28 CPU cores are reported.The overhead and the core idle time are shown to be negligible with the exception of the smallest matrices and highest core counts.

Ort, förlag, år, upplaga, sidor
Springer, 2018
Serie
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10777
Nyckelord
Eigenvalue reordering problem, Task based programming, Shared memory machines
Nationell ämneskategori
Datavetenskap (datalogi) Beräkningsmatematik
Forskningsämne
datalogi
Identifikatorer
urn:nbn:se:umu:diva-145987 (URN)10.1007/978-3-319-78024-5_19 (DOI)000458563300019 ()2-s2.0-85044751795 (Scopus ID)978-3-319-78023-8 (ISBN)978-3-319-78024-5 (ISBN)
Konferens
12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017)
Tillgänglig från: 2018-03-24 Skapad: 2018-03-24 Senast uppdaterad: 2023-03-24Bibliografiskt granskad
Myllykoski, M., Karlsson, L., Kågström, B., Eljammaly, M., Pranesh, S. & Zounon, M. (2018). D2.6 Prototype Software for Eigenvalue Problem Solvers. NLAFET Consortium; Umeå University
Öppna denna publikation i ny flik eller fönster >>D2.6 Prototype Software for Eigenvalue Problem Solvers
Visa övriga...
2018 (Engelska)Rapport (Övrigt vetenskapligt)
Ort, förlag, år, upplaga, sidor
NLAFET Consortium; Umeå University, 2018. s. 32
Nationell ämneskategori
Datavetenskap (datalogi)
Forskningsämne
matematik; datalogi
Identifikatorer
urn:nbn:se:umu:diva-170222 (URN)
Projekt
NLAFET
Anmärkning

Part of: Public Deliverables: WP2 – Dense Linear Systems and Eigenvalue Problem Solvers

Tillgänglig från: 2020-04-29 Skapad: 2020-04-29 Senast uppdaterad: 2020-05-05Bibliografiskt granskad
Organisationer
Identifikatorer
ORCID-id: ORCID iD iconorcid.org/0000-0002-3689-0899

Sök vidare i DiVA

Visa alla publikationer