Work-Stealing Prefix Scan: Addressing Load Imbalance in Large-Scale Image RegistrationShow others and affiliations
2022 (English)In: IEEE Transactions on Parallel and Distributed Systems, ISSN 1045-9219, E-ISSN 1558-2183, Vol. 33, no 3, p. 523-535, article id 9477174Article in journal (Refereed) Published
Abstract [en]
Parallelism patterns (e.g., map or reduce) have proven to be effective tools for parallelizing high-performance applications. In this article, we study the recursive registration of a series of electron microscopy images - a time consuming and imbalanced computation necessary for nano-scale microscopy analysis. We show that by translating the image registration into a specific instance of the prefix scan, we can convert this seemingly sequential problem into a parallel computation that scales to over thousand of cores. We analyze a variety of scan algorithms that behave similarly for common low-compute operators and propose a novel work-stealing procedure for a hierarchical prefix scan. Our evaluation shows that by identifying a suitable and well-optimized prefix scan algorithm, we reduce time-to-solution on a series of 4,096 images spanning ten seconds of microscopy acquisition from over 10 hours to less than 3 minutes (using 1024 Intel Haswell cores), enabling derivation of material properties at nanoscale for long microscopy image series.
Place, publisher, year, edition, pages
IEEE Computer Society, 2022. Vol. 33, no 3, p. 523-535, article id 9477174
Keywords [en]
image registration, load balancing, parallel algorithms, Prefix sum, work stealing
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-186545DOI: 10.1109/TPDS.2021.3095230ISI: 000679931600001Scopus ID: 2-s2.0-85111727330OAI: oai:DiVA.org:umu-186545DiVA, id: diva2:1586487
2021-08-202021-08-202023-11-10Bibliographically approved