New serial and parallel recursive QR factorization algorithms for SMP systems
1998 (English). In: Applied parallel computing: large scale scientific and industrial problems: 4th international workshop, PARA '98, Umeå, Sweden, June 14-17, 1998: proceedings / [ed] Bo Kågström, Jack Dongarra, Erik Elmroth, Jerzy Wasniewski, Heidelberg/Berlin, Germany: Springer, 1998, Vol. 1541, 120-128 p. Conference paper (Other academic)
We present a new recursive algorithm for the QR factorization of an m by n matrix A. The recursion leads to an automatic variable blocking that allows us to replace a level 2 part in a standard block algorithm by level 3 operations. However, there are some additional costs for performing the updates, which prohibit the efficient use of the recursion for large n. This obstacle is overcome by using a hybrid recursive algorithm that outperforms the LAPACK algorithm DGEQRF by 78% to 21% as m=n increases from 100 to 1000. A successful parallel implementation based on dynamic load balancing, running on a PowerPC 604-based IBM SMP node, is presented. For 2, 3, and 4 processors and m=n=2000 it shows speedups of 1.96, 2.99, and 3.92 compared to our uniprocessor algorithm.
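The column-halving recursion described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: for brevity it swaps in Gram-Schmidt-style projections with an explicit Q (less numerically robust than the Householder reflectors used by DGEQRF), it omits the hybrid switch to a blocked algorithm for large n, and it assumes m >= n with full column rank. The names recursive_qr, matmul, and transpose are hypothetical helpers introduced here. What the sketch does show is how splitting the columns in half turns every update of the right half into matrix-matrix products, with block sizes chosen automatically by the recursion.

```python
import math

def matmul(X, Y):
    """Dense product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*X)]

def recursive_qr(A):
    """Return (Q, R) with A = Q R, Q having orthonormal columns, R upper triangular.

    Illustrative only: assumes A is m x n with m >= n and full column rank,
    and uses Gram-Schmidt projections rather than Householder reflectors.
    """
    n = len(A[0])
    if n == 1:
        # Base case: normalize a single column.
        norm = math.sqrt(sum(row[0] ** 2 for row in A))
        return [[row[0] / norm] for row in A], [[norm]]

    # Split the columns in half; the recursion picks the block sizes.
    n1 = n // 2
    A1 = [row[:n1] for row in A]
    A2 = [row[n1:] for row in A]

    # Factor the left half recursively.
    Q1, R11 = recursive_qr(A1)

    # Update the right half with matrix-matrix products:
    # R12 = Q1^T A2, then A2 <- A2 - Q1 R12.
    R12 = matmul(transpose(Q1), A2)
    P = matmul(Q1, R12)
    A2 = [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(A2, P)]

    # Factor the updated right half recursively.
    Q2, R22 = recursive_qr(A2)

    # Assemble Q = [Q1 Q2] and R = [[R11, R12], [0, R22]].
    Q = [q1 + q2 for q1, q2 in zip(Q1, Q2)]
    R = [r11 + r12 for r11, r12 in zip(R11, R12)]
    R += [[0.0] * n1 + r22 for r22 in R22]
    return Q, R
```

Calling recursive_qr on a small full-rank matrix and checking that matmul(Q, R) reproduces the input, and that transpose(Q) times Q is close to the identity, is a quick sanity test of the sketch.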
Place, publisher, year, edition, pages
Heidelberg/Berlin, Germany: Springer, 1998. Vol. 1541, 120-128 p.
Lecture Notes in Computer Science, ISSN 0302-9743; 1541/1998
Identifiers
URN: urn:nbn:se:umu:diva-40430; DOI: 10.1007/BFb0095328; ISBN: 3-540-65414-3; OAI: oai:DiVA.org:umu-40430; DiVA: diva2:399645
4th International Workshop, PARA’98 Umeå, Sweden, June 14–17