TY - GEN
T1 - Real-time High-resolution X-Ray Computed Tomography
AU - Wu, Du
AU - Chen, Peng
AU - Wang, Xiao
AU - Lyngaas, Issac
AU - Miyajima, Takaaki
AU - Endo, Toshio
AU - Matsuoka, Satoshi
AU - Wahib, Mohamed
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/5/30
Y1 - 2024/5/30
N2 - Computed Tomography (CT) serves as a key imaging technology that relies on computationally intensive filtering and back-projection algorithms for 3D image reconstruction. While conventional high-resolution image reconstruction (> 2K³) solutions provide quick results, they typically treat reconstruction as an offline workload to be performed remotely on large-scale HPC systems. The growing demand for post-construction AI-driven analytics and the need for real-time adjustments call for high-resolution reconstruction solutions that are feasible on local computing resources, i.e., a multi-GPU server at most. In this paper, we propose a novel approach that utilizes Tensor Cores to optimize image reconstruction without sacrificing precision. We also introduce a framework designed to enable real-time execution of end-to-end distributed image reconstruction in a multi-GPU environment. Evaluations conducted on a single Nvidia A100 and H100 GPU show performance improvements of 1.91× and 2.15× compared to highly optimized production libraries. Furthermore, our framework, when deployed on an 8-card Nvidia A100 GPU system, demonstrates the ability to reconstruct real-world datasets into 2048³ volumes (32 GB) in slightly more than one minute and 4096³ volumes (256 GB) in 7 minutes.
AB - Computed Tomography (CT) serves as a key imaging technology that relies on computationally intensive filtering and back-projection algorithms for 3D image reconstruction. While conventional high-resolution image reconstruction (> 2K³) solutions provide quick results, they typically treat reconstruction as an offline workload to be performed remotely on large-scale HPC systems. The growing demand for post-construction AI-driven analytics and the need for real-time adjustments call for high-resolution reconstruction solutions that are feasible on local computing resources, i.e., a multi-GPU server at most. In this paper, we propose a novel approach that utilizes Tensor Cores to optimize image reconstruction without sacrificing precision. We also introduce a framework designed to enable real-time execution of end-to-end distributed image reconstruction in a multi-GPU environment. Evaluations conducted on a single Nvidia A100 and H100 GPU show performance improvements of 1.91× and 2.15× compared to highly optimized production libraries. Furthermore, our framework, when deployed on an 8-card Nvidia A100 GPU system, demonstrates the ability to reconstruct real-world datasets into 2048³ volumes (32 GB) in slightly more than one minute and 4096³ volumes (256 GB) in 7 minutes.
KW - Computed Tomography
KW - GPU
KW - Tensor Cores
UR - http://www.scopus.com/inward/record.url?scp=85196279845&partnerID=8YFLogxK
U2 - 10.1145/3650200.3656634
DO - 10.1145/3650200.3656634
M3 - Conference contribution
AN - SCOPUS:85196279845
T3 - Proceedings of the International Conference on Supercomputing
SP - 110
EP - 123
BT - ICS 2024 - Proceedings of the 38th ACM International Conference on Supercomputing
PB - Association for Computing Machinery
T2 - 38th ACM International Conference on Supercomputing, ICS 2024
Y2 - 4 June 2024 through 7 June 2024
ER -