A novel parallel QR algorithm for hybrid distributed memory HPC systems
2010 (English). In: SIAM Journal on Scientific Computing, ISSN 1064-8275, E-ISSN 1095-7197, Vol. 32, no 4, 2345-2378 p. Article in journal (Refereed), Published
A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing systems is presented. For this purpose, we introduce the concept of multiwindow bulge chain chasing and parallelize aggressive early deflation. The multiwindow approach ensures that most computations when chasing chains of bulges are performed in level 3 BLAS operations, while the aim of aggressive early deflation is to speed up the convergence of the QR algorithm. Mixed MPI-OpenMP coding techniques are utilized for porting the codes to distributed memory platforms with multithreaded nodes, such as multicore processors. Numerous numerical experiments confirm the superior performance of our parallel QR algorithm in comparison with the existing ScaLAPACK code, leading to an implementation that is one to two orders of magnitude faster for sufficiently large problems, including a number of examples from applications.
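For orientation, the classical sequential iteration that the parallel algorithm builds on can be sketched as follows. This is a minimal illustration in plain Python, not the multiwindow bulge chain chasing or aggressive early deflation described in the abstract, and not the ScaLAPACK code: it applies explicit single-shift QR steps to a small upper Hessenberg matrix and deflates one eigenvalue at a time with the standard small-subdiagonal test. The function names (`givens`, `qr_step`, `trailing_shift`, `hessenberg_eigenvalues`), the tolerance, and the example matrix are assumptions made for illustration, and the sketch assumes that all eigenvalues are real.

```python
import math


def givens(a, b):
    """Return (c, s) with c*a + s*b = r and -s*a + c*b = 0."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r


def qr_step(H, mu):
    """One explicit shifted QR step, in place: H - mu*I = Q R, then H <- R Q + mu*I.

    H must be upper Hessenberg (zero below the first subdiagonal).
    """
    n = len(H)
    for i in range(n):
        H[i][i] -= mu
    rotations = []
    # Reduce H - mu*I to upper triangular R with n-1 Givens rotations from the left.
    for k in range(n - 1):
        c, s = givens(H[k][k], H[k + 1][k])
        rotations.append((c, s))
        for j in range(k, n):
            a, b = H[k][j], H[k + 1][j]
            H[k][j] = c * a + s * b
            H[k + 1][j] = -s * a + c * b
    # Form R Q by applying the transposed rotations from the right.
    for k, (c, s) in enumerate(rotations):
        for i in range(k + 2):
            a, b = H[i][k], H[i][k + 1]
            H[i][k] = c * a + s * b
            H[i][k + 1] = -s * a + c * b
    for i in range(n):
        H[i][i] += mu


def trailing_shift(H):
    """Eigenvalue of the trailing 2x2 block closest to the last diagonal entry.

    Falls back to the last diagonal entry if that block has a complex pair
    (assumption: this sketch only targets real eigenvalues).
    """
    n = len(H)
    a, b = H[n - 2][n - 2], H[n - 2][n - 1]
    c, d = H[n - 1][n - 2], H[n - 1][n - 1]
    half_trace = (a + d) / 2.0
    disc = half_trace * half_trace - (a * d - b * c)
    if disc < 0.0:
        return d
    root = math.sqrt(disc)
    l1, l2 = half_trace + root, half_trace - root
    return l1 if abs(l1 - d) < abs(l2 - d) else l2


def hessenberg_eigenvalues(H, tol=1e-12, max_steps=1000):
    """Eigenvalues of a real upper Hessenberg matrix with real spectrum."""
    H = [row[:] for row in H]  # work on a copy
    eigenvalues = []
    steps = 0
    while len(H) > 1:
        n = len(H)
        # Standard deflation test: the last subdiagonal entry is negligible
        # relative to its two diagonal neighbours.
        if abs(H[n - 1][n - 2]) <= tol * (abs(H[n - 2][n - 2]) + abs(H[n - 1][n - 1])):
            eigenvalues.append(H[n - 1][n - 1])
            H = [row[: n - 1] for row in H[: n - 1]]  # shrink the active block
            continue
        if steps == max_steps:
            raise RuntimeError("no convergence (complex eigenvalues are not handled)")
        qr_step(H, trailing_shift(H))
        steps += 1
    eigenvalues.append(H[0][0])
    return sorted(eigenvalues)


# Hypothetical example: a nonsymmetric Hessenberg matrix with three real eigenvalues.
A = [[4.0, 1.0, 0.0],
     [2.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
print(hessenberg_eigenvalues(A))
```

Each step in this sketch touches the matrix entry by entry and deflates a single eigenvalue at a time. The abstract contrasts this with chasing chains of bulges so that most of the work is done in level 3 BLAS operations, and with aggressive early deflation to speed up convergence.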
Place, publisher, year, edition, pages
Society for Industrial and Applied Mathematics (SIAM), 2010. Vol. 32, no 4, 2345-2378 p.
Identifiers
URN: urn:nbn:se:umu:diva-50997
DOI: 10.1137/090756934
ISI: 000280771100030
OAI: oai:DiVA.org:umu-50997
DiVA: diva2:472934