Design and performance modeling of parallel block matrix factorizations for distributed memory multicomputers
1992 (English). In: Proceedings of the Industrial Mathematics Week, 1992, 102-116 p. Conference paper (Refereed)
Efficient and scalable parallel block algorithms for LU factorization with partial pivoting, Cholesky factorization, and QR factorization in a distributed memory multicomputer environment are presented. The distributed system is viewed as a ring of processors, and the algorithms correspond to shared memory algorithms parallelized at the block level (explicit parallelism). The performance of the algorithms is analyzed theoretically and illustrated empirically by implementations on the Intel iPSC/2 hypercube. A model predicting performance and optimal block size is presented.
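To illustrate what a block algorithm means for one of the factorizations named in the abstract, the following is a minimal sequential sketch of a right-looking blocked Cholesky factorization. It is not taken from the paper and does not reproduce its distributed ring algorithm, data layout, or performance model. The function name `blocked_cholesky` and the block-size parameter `nb` are hypothetical names chosen for this example.

```python
import math


def blocked_cholesky(A, nb):
    """Return lower-triangular L with L * L^T == A, working nb columns at a time.

    Illustrative sequential sketch only (an assumption, not the paper's method).
    A is a symmetric positive definite matrix given as a list of lists.
    """
    n = len(A)
    L = [row[:] for row in A]  # work on a copy; only the lower triangle is read
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        # Panel factorization: columns k0..k1-1, rows k0..n-1.
        for j in range(k0, k1):
            d = L[j][j] - sum(L[j][p] ** 2 for p in range(k0, j))
            if d <= 0.0:
                raise ValueError("matrix is not positive definite")
            L[j][j] = math.sqrt(d)
            for i in range(j + 1, n):
                s = sum(L[i][p] * L[j][p] for p in range(k0, j))
                L[i][j] = (L[i][j] - s) / L[j][j]
        # Trailing update: subtract the panel's contribution from the
        # remaining lower triangle (rows and columns k1..n-1).
        for i in range(k1, n):
            for j in range(k1, i + 1):
                L[i][j] -= sum(L[i][p] * L[j][p] for p in range(k0, k1))
    # Clear the strictly upper triangle so the result is a clean factor.
    for i in range(n):
        for j in range(i + 1, n):
            L[i][j] = 0.0
    return L


A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
L = blocked_cholesky(A, nb=2)
```

In this sketch the block size `nb` only changes the order of the arithmetic, not the result. On a distributed memory machine the choice of block size would also affect communication and load balance, which is the kind of trade-off a performance model such as the one mentioned in the abstract is meant to capture.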
Place, publisher, year, edition, pages
1992. 102-116 p.
Keywords
block matrix factorizations, distributed memory multicomputers, performance modeling
Identifiers
URN: urn:nbn:se:umu:diva-40436; OAI: oai:DiVA.org:umu-40436; DiVA: diva2:399673