Parallel and Cache-Efficient In-Place Matrix Storage Format Conversion
2012 (English) In: ACM Transactions on Mathematical Software, ISSN 0098-3500, Vol. 38, no. 3, p. 17:1-17:32. Article in journal (Refereed), Published
Techniques and algorithms for efficient in-place conversion to and from standard and blocked matrix storage formats are described. Such functionality is required by numerical libraries that use different data layouts internally. Parallel algorithms and a software package for in-place matrix storage format conversion based on in-place matrix transposition are presented and evaluated. A new algorithm for in-place transposition which efficiently determines the structure of the transposition permutation a priori is one of the key ingredients. It enables effective load balancing in a parallel environment.
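To make the idea of a transposition permutation concrete, here is a minimal Python sketch of cycle-following in-place transposition for a row-major matrix stored in a flat list. It is not the algorithm from the article: the function name `transpose_in_place` is hypothetical, and cycle leaders are found with a simple "smallest index in the cycle" test rather than by determining the cycle structure a priori, which is the article's contribution.

```python
def transpose_in_place(a, m, n):
    """Transpose an m-by-n row-major matrix stored in the flat list `a`, in place.

    Illustrative sketch only: cycle leaders are detected by walking each cycle,
    which costs extra time; the article instead derives the cycle structure
    a priori.
    """
    size = m * n
    if size != len(a):
        raise ValueError("len(a) must equal m * n")
    if size < 2:
        return
    last = size - 1  # the first and last elements never move

    def dest(k):
        # The element at row-major index k = i*n + j moves to j*m + i,
        # which equals (k * m) mod (m*n - 1) for 0 < k < m*n - 1.
        return (k * m) % last

    for start in range(1, last):
        # Leader test: process a cycle only from its smallest index.
        k = dest(start)
        while k > start:
            k = dest(k)
        if k < start:
            continue  # this cycle was already handled from a smaller index

        # Rotate the cycle, carrying one displaced element at a time.
        carried = a[start]
        k = dest(start)
        while k != start:
            a[k], carried = carried, a[k]
            k = dest(k)
        a[start] = carried
```

For example, calling `transpose_in_place(a, 2, 3)` on `a = [1, 2, 3, 4, 5, 6]` leaves `a == [1, 4, 2, 5, 3, 6]`, the row-major form of the 3-by-2 transpose. Because the cycles are disjoint, they could in principle be distributed across workers once their lengths are known, which is why knowing the permutation's structure in advance matters for load balancing.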
Place, publisher, year, edition, pages
New York: Association for Computing Machinery, 2012. Vol. 38, no. 3, p. 17:1-17:32.
Algorithms, Performance, Theory, Blocked matrix data layout, in-place matrix transposition, parallel and cache-efficient algorithms
Computer and Information Science; Mathematics
Identifiers
URN: urn:nbn:se:umu:diva-56166
DOI: 10.1145/2168773.2168775
ISI: 000303654900002
OAI: oai:DiVA.org:umu-56166
DiVA: diva2:545374