Performance portable back-projection algorithms on CPUs: Agnostic data locality and vectorization optimizations

Peng Chen, Mohamed Wahib, Xiao Wang, Shinichiro Takizawa, Takahiro Hirofuchi, Hirotaka Ogawa, Satoshi Matsuoka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Computed Tomography (CT) is a key 3D imaging technology that fundamentally relies on the compute-intensive back-projection operation to generate 3D volumes. GPUs are typically used for back-projection in production CT devices. However, with the rise of power-constrained micro-CT devices, and also the emergence of CPUs comparable in performance to GPUs, back-projection on CPUs could become favorable. Unlike on GPUs, extracting parallelism for back-projection algorithms on CPUs is complex because parallelism and locality are not explicitly defined and controlled by the programmer, as they are when using CUDA, for instance. We propose a collection of novel back-projection algorithms that reduce the arithmetic computation, robustly enable vectorization, enforce a regular memory access pattern, and maximize the data locality. We also implement the novel algorithms as efficient back-projection kernels that are performance portable over a wide range of CPUs. Performance evaluation using a variety of CPUs from different vendors and generations demonstrates that our back-projection implementation achieves an average 5.2× speedup over the multi-threaded implementation of the most widely used, and optimized, open library. With a state-of-the-art CPU, we reach performance that rivals top-performing GPUs.

Original language: English
Title of host publication: ICS 2021 - Proceedings of the 2021 ACM International Conference on Supercomputing
Publisher: Association for Computing Machinery
Pages: 316-328
Number of pages: 13
ISBN (Electronic): 9781450383356
State: Published - Jun 3 2021
Externally published: Yes
Event: 35th ACM International Conference on Supercomputing, ICS 2021 - Virtual, Online, United States
Duration: Jun 14 2021 to Jun 17 2021

Publication series

Name: Proceedings of the International Conference on Supercomputing

Conference

Conference: 35th ACM International Conference on Supercomputing, ICS 2021
Country/Territory: United States
City: Virtual, Online
Period: 06/14/21 to 06/17/21

Funding

This work was supported by JSPS KAKENHI Grant Number JP21K17750. This work was partially supported by JST-CREST under Grant Number JPMJCR19F5; JST, PRESTO Grant Number JPMJPR20MA, Japan. We would like to thank Endo Lab at Tokyo Institute of Technology for providing computing resources. The author wishes to acknowledge useful discussions with Dr. Jintao Meng at Chinese Academy of Sciences (CAS).

Keywords

  • Computed tomography
  • Data locality
  • Vectorization
