Algorithm 953: Parallel Library Software for the Multishift QR Algorithm with Aggressive Early Deflation
Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). Umeå University, Faculty of Science and Technology, Department of Computing Science.
Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
2015 (English). In: ACM Transactions on Mathematical Software, ISSN 0098-3500, E-ISSN 1557-7295, Vol. 41, no 4, 29. Article in journal (Refereed). Published
Abstract [en]

Library software implementing a parallel small-bulge multishift QR algorithm with Aggressive Early Deflation (AED) targeting distributed memory high-performance computing systems is presented. Starting from recent developments of the parallel multishift QR algorithm [Granat et al., SIAM J. Sci. Comput. 32(4), 2010], we describe a number of algorithmic and implementation improvements. These include communication-avoiding algorithms via data redistribution and a refined strategy for balancing between multishift QR sweeps and AED. Guidelines concerning several important tunable algorithmic parameters are provided. As a result of these improvements, a computational bottleneck within AED has been removed in the parallel multishift QR algorithm. A performance model is established to explain the scalability behavior of the new parallel multishift QR algorithm. Numerous computational experiments confirm that our new implementation significantly outperforms previous parallel implementations of the QR algorithm.

Place, publisher, year, edition, pages
2015. Vol. 41, no 4, 29
Keyword [en]
Algorithms, Performance, Multishift QR algorithm, aggressive early deflation, parallel algorithms, distributed memory architectures
National Category
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-111765
DOI: 10.1145/2699471
ISI: 000363733000007
ScopusID: 2-s2.0-84943313977
OAI: oai:DiVA.org:umu-111765
DiVA: diva2:874086
Available from: 2015-11-25. Created: 2015-11-23. Last updated: 2015-11-25. Bibliographically approved

Search in DiVA

By author/editor
Granat, Robert; Kagstrom, Bo; Shao, Meiyue