TY - GEN
T1 - Multi-core cluster implementation of SIRT with application to cone beam micro-CT
AU - Gregor, Jens
AU - Lenox, Mark
AU - Bingham, Philip
AU - Arrowood, Lloyd
PY - 2009
Y1 - 2009
AB - Iterative x-ray CT algorithms produce high-quality images but potentially do so at a prohibitive computational cost when applied to large problems. We describe how to implement SIRT, a well-known weighted least squares algorithm, for parallel execution using a commodity cluster of networked multi-core PCs. Algorithmic strategies include near-optimal relaxation which eliminates half of the iterations needed, scalar preconditioning which reduces the number of global reductions, orthogonalized ordered subsets which greatly increases the rate of convergence, and focus of attention which reduces the overall problem size in a data-driven manner. Implementation strategies include a workload distribution scheme which provides each core with mutex-free access to its local shared memory, as well as a modification thereof that leads to a balanced workload for the entire cluster. We illustrate the efficacy of the above scalable approach by providing experimental results for a cone beam micro-CT mouse data set.
UR - http://www.scopus.com/inward/record.url?scp=77951160150&partnerID=8YFLogxK
U2 - 10.1109/NSSMIC.2009.5402342
DO - 10.1109/NSSMIC.2009.5402342
M3 - Conference contribution
AN - SCOPUS:77951160150
SN - 9781424439621
T3 - IEEE Nuclear Science Symposium Conference Record
SP - 4120
EP - 4125
BT - 2009 IEEE Nuclear Science Symposium Conference Record, NSS/MIC 2009
T2 - 2009 IEEE Nuclear Science Symposium Conference Record, NSS/MIC 2009
Y2 - 25 October 2009 through 31 October 2009
ER -