High-performance library software for QR factorization
2001 (English) In: Applied Parallel Computing: New Paradigms for HPC in Industry and Academia. 5th International Workshop, PARA 2000 Bergen, Norway, June 18–20, 2000 Proceedings / [ed] Tor Sørevik, Fredrik Manne, Assefaw Hadish Gebremedhin, Randi Moe, Heidelberg/Berlin, Germany: Springer, 2001, Vol. 1947, 53-63 p. Conference paper (Other academic)
In earlier work, we presented algorithm RGEQR3, a purely recursive formulation of the QR factorization. Recursion leads to a natural way of choosing the k-way aggregating Householder transform of Schreiber and Van Loan. RGEQR3 is a performance-critical subroutine of the main (hybrid recursive) routine RGEQRF for QR factorization of a general m×n matrix. This contribution presents a new version of RGEQRF and its accompanying SMP parallel counterpart, implemented for a future release of the IBM ESSL library. It represents a robust, high-performance piece of library software for QR factorization on uniprocessor and multiprocessor systems. The implementation builds on previous results. In particular, the new version is optimized in a number of ways to improve performance, e.g., for small matrices and matrices with a very small number of columns. This is partly done by including mini blocking in the otherwise purely recursive RGEQR3. We describe the salient features of this implementation. Our serial implementation outperforms the corresponding LAPACK routine by 10-65% for square matrices and by 10-100% for tall and thin matrices on IBM POWER2 and POWER3 nodes. The tests covered matrix sizes ranging from very small to very large. The SMP parallel implementation shows close to perfect speedup on a 4-processor PPC604e node.
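The recursive idea summarized in the abstract can be illustrated with a minimal pure-Python sketch. It is not the ESSL routine: it leaves out the mini blocking, the hybrid outer routine RGEQRF, and all performance tuning. It assumes a dense m×n input with m ≥ n stored as a list of rows, and a split of the columns at n // 2. The names `recursive_qr`, `householder`, `apply_q`, `matmul`, and `transpose` are hypothetical and chosen only for this illustration. The factorization is returned in the compact form Q = I - Y T Yᵀ, where the two halves are merged with the off-diagonal block -T1 (Y1ᵀ Y2) T2.

```python
import math

def matmul(A, B):
    """Dense product of two non-empty row-major matrices (lists of rows)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def householder(x):
    """Return (v, tau, beta) with v[0] == 1 and (I - tau v v^T) x = beta e_1."""
    alpha = x[0]
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm == 0.0:  # zero column: the identity reflector suffices
        return [1.0] + [0.0] * (len(x) - 1), 0.0, 0.0
    beta = -math.copysign(norm, alpha)  # sign chosen to avoid cancellation
    v = [1.0] + [xi / (alpha - beta) for xi in x[1:]]
    return v, (beta - alpha) / beta, beta

def recursive_qr(A):
    """Recursive QR of an m x n matrix (m >= n): A = (I - Y T Y^T) [R; 0]."""
    n = len(A[0])
    if n == 1:  # base case: a single Householder reflector
        v, tau, beta = householder([row[0] for row in A])
        return [[vi] for vi in v], [[tau]], [[beta]]
    n1 = n // 2
    n2 = n - n1
    # Factor the left half of the columns.
    Y1, T1, R1 = recursive_qr([row[:n1] for row in A])
    # Update the right half: A2 <- Q1^T A2 = A2 - Y1 (T1^T (Y1^T A2)).
    A2 = [row[n1:] for row in A]
    W = matmul(Y1, matmul(transpose(T1), matmul(transpose(Y1), A2)))
    A2 = [[a - w for a, w in zip(ra, rw)] for ra, rw in zip(A2, W)]
    # Factor the trailing block below the first n1 rows.
    Y2_low, T2, R2 = recursive_qr(A2[n1:])
    Y2 = [[0.0] * n2 for _ in range(n1)] + Y2_low  # pad back to m rows
    # Merge the two halves: off-diagonal block of T is -T1 (Y1^T Y2) T2.
    T12 = matmul(matmul(T1, matmul(transpose(Y1), Y2)), T2)
    Y = [r1 + r2 for r1, r2 in zip(Y1, Y2)]
    T = ([t1 + [-t for t in t12] for t1, t12 in zip(T1, T12)]
         + [[0.0] * n1 + t2 for t2 in T2])
    R = ([r1 + r12 for r1, r12 in zip(R1, A2[:n1])]
         + [[0.0] * n1 + r2 for r2 in R2])
    return Y, T, R

def apply_q(Y, T, B):
    """Compute Q B = B - Y (T (Y^T B)) without forming Q explicitly."""
    W = matmul(Y, matmul(T, matmul(transpose(Y), B)))
    return [[b - w for b, w in zip(rb, rw)] for rb, rw in zip(B, W)]
```

Calling `recursive_qr(A)` returns `Y` (unit lower trapezoidal, m×n), `T` (upper triangular, n×n), and `R` (upper triangular, n×n). Padding `R` with m - n zero rows and passing it to `apply_q(Y, T, ...)` should reproduce `A` up to rounding error. Keeping Q in this aggregated form lets the update of the right half be written as matrix-matrix products, which is the kind of operation high-performance libraries are tuned for.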
Place, publisher, year, edition, pages
Heidelberg/Berlin, Germany: Springer, 2001. Vol. 1947, 53-63 p.
Lecture Notes in Computer Science, ISSN 0302-9743 ; 1947/2001
Serial and parallel library software, QR factorization, recursion, register blocking, unrolling, SMP systems, dynamic load balancing
Identifiers
URN: urn:nbn:se:umu:diva-40423
DOI: 10.1007/3-540-70734-4_9
OAI: oai:DiVA.org:umu-40423
DiVA: diva2:399625
5th International Workshop, PARA 2000 Bergen, Norway, June 18–20, 2000 Proceedings