Distributed one-stage Hessenberg-triangular reduction with wavefront scheduling
2016 (English) Report (Other academic)
A novel parallel formulation of Hessenberg-triangular reduction of a regular matrix pair on distributed memory computers is presented. The formulation is based on a sequential cache-blocked algorithm by Kågström, Kressner, E.S. Quintana-Ortí, and G. Quintana-Ortí (2008). A static scheduling algorithm is proposed that addresses the problem of underutilized processes caused by two-sided updates of matrix pairs based on sequences of rotations. Experiments using up to 961 processes demonstrate that the new algorithm improves on the state of the art, and they also identify factors that currently limit its scalability.
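To illustrate what "two-sided updates of matrix pairs based on sequences of rotations" means, the following is a minimal sketch of the textbook unblocked, sequential reduction using Givens rotations. It is not the cache-blocked or distributed algorithm of the report; the function names `givens` and `hessenberg_triangular` are hypothetical and chosen only for this example.

```python
import numpy as np

def givens(a, b):
    """Return (c, s) with [[c, s], [-s, c]] @ [a, b] == [r, 0]."""
    r = np.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

def hessenberg_triangular(A, B):
    """Reduce (A, B) to (H, T) with H upper Hessenberg and T upper
    triangular, such that Q.T @ A @ Z == H and Q.T @ B @ Z == T.
    Illustrative unblocked sketch only (assumption: real square inputs)."""
    A = np.array(A, dtype=float)
    B = np.array(B, dtype=float)
    n = A.shape[0]
    # Step 1: make B upper triangular with a QR factorization.
    Q, R = np.linalg.qr(B)
    H = Q.T @ A
    T = np.triu(R)
    Z = np.eye(n)
    # Step 2: zero the entries of H below the first subdiagonal,
    # column by column, from the bottom up.
    for j in range(n - 2):
        for i in range(n - 1, j + 1, -1):
            # Left rotation on rows i-1, i zeroes H[i, j] ...
            c, s = givens(H[i - 1, j], H[i, j])
            G = np.array([[c, s], [-s, c]])
            H[i - 1:i + 1, :] = G @ H[i - 1:i + 1, :]
            T[i - 1:i + 1, :] = G @ T[i - 1:i + 1, :]
            Q[:, i - 1:i + 1] = Q[:, i - 1:i + 1] @ G.T
            H[i, j] = 0.0
            # ... but creates fill-in at T[i, i-1]. A right rotation
            # on columns i-1, i removes it again.
            c, s = givens(T[i, i], T[i, i - 1])
            M = np.array([[c, s], [-s, c]])
            H[:, i - 1:i + 1] = H[:, i - 1:i + 1] @ M
            T[:, i - 1:i + 1] = T[:, i - 1:i + 1] @ M
            Z[:, i - 1:i + 1] = Z[:, i - 1:i + 1] @ M
            T[i, i - 1] = 0.0
    return H, T, Q, Z

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
B = rng.standard_normal((6, 6))
H, T, Q, Z = hessenberg_triangular(A, B)
```

Each left rotation touches two full rows of both matrices and each right rotation touches two full columns, so every elimination step is a two-sided update of the pair. In this sketch the rotations are applied strictly one after another, which hints at why keeping many processes busy requires careful scheduling.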
Place, publisher, year, edition, pages
Umeå: Department of Computing Science, Umeå University, 2016. 26 p.
Report / UMINF, ISSN 0348-0542 ; 16.10
Identifiers
URN: urn:nbn:se:umu:diva-120002
OAI: oai:DiVA.org:umu-120002
DiVA: diva2:926156