TY - GEN
T1 - Image Gradient Decomposition for Parallel and Memory-Efficient Ptychographic Reconstruction
AU - Wang, Xiao
AU - Tsaris, Aristeidis
AU - Mukherjee, Debangshu
AU - Wahib, Mohamed
AU - Chen, Peng
AU - Oxley, Mark
AU - Ovchinnikova, Olga
AU - Hinkle, Jacob
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Ptychography is a popular microscopic imaging modality for many scientific discoveries and sets the record for highest image resolution. Unfortunately, the high image resolution for ptychographic reconstruction requires significant amount of memory and computations, forcing many applications to compromise their image resolution in exchange for a smaller memory footprint and a shorter reconstruction time. In this paper, we propose a novel image gradient decomposition method that significantly reduces the memory footprint for ptychographic reconstruction by tessellating image gradients and diffraction measurements into tiles. In addition, we propose a parallel image gradient decomposition method that enables asynchronous point-to-point communications and parallel pipelining with minimal overhead on a large number of GPUs. Our experiments on a Titanate material dataset (PbTiO3) with 16632 probe locations show that our Gradient Decomposition algorithm reduces memory footprint by 51 times. In addition, it achieves time-to-solution within 2.2 minutes by scaling to 4158 GPUs with a super-linear strong scaling efficiency at 364% compared to runtimes at 6 GPUs. This performance is 2.7 times more memory efficient, 9 times more scalable and 86 times faster than the state-of-the-art algorithm.
AB - Ptychography is a popular microscopic imaging modality for many scientific discoveries and sets the record for highest image resolution. Unfortunately, the high image resolution for ptychographic reconstruction requires significant amount of memory and computations, forcing many applications to compromise their image resolution in exchange for a smaller memory footprint and a shorter reconstruction time. In this paper, we propose a novel image gradient decomposition method that significantly reduces the memory footprint for ptychographic reconstruction by tessellating image gradients and diffraction measurements into tiles. In addition, we propose a parallel image gradient decomposition method that enables asynchronous point-to-point communications and parallel pipelining with minimal overhead on a large number of GPUs. Our experiments on a Titanate material dataset (PbTiO3) with 16632 probe locations show that our Gradient Decomposition algorithm reduces memory footprint by 51 times. In addition, it achieves time-to-solution within 2.2 minutes by scaling to 4158 GPUs with a super-linear strong scaling efficiency at 364% compared to runtimes at 6 GPUs. This performance is 2.7 times more memory efficient, 9 times more scalable and 86 times faster than the state-of-the-art algorithm.
KW - electron microscopy
KW - high performance computing
KW - image reconstruction
KW - parallel partitioning
UR - http://www.scopus.com/inward/record.url?scp=85149309742&partnerID=8YFLogxK
U2 - 10.1109/SC41404.2022.00013
DO - 10.1109/SC41404.2022.00013
M3 - Conference contribution
AN - SCOPUS:85149309742
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2022
PB - IEEE Computer Society
T2 - 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022
Y2 - 13 November 2022 through 18 November 2022
ER -