TY - JOUR
T1 - Efficient Distributed Sequence Parallelism for Transformer-based Image Segmentation
AU - Lyngaas, Isaac
AU - Meena, Murali Gopalakrishnan
AU - Calabrese, Evan
AU - Wahib, Mohamed
AU - Chen, Peng
AU - Igarashi, Jun
AU - Huo, Yuankai
AU - Wang, Xiao
N1 - Publisher Copyright:
© 2024, Society for Imaging Science and Technology.
PY - 2024
Y1 - 2024
N2 - We introduce an efficient distributed sequence parallel approach for training transformer-based deep learning image segmentation models. The neural network models combine a Vision Transformer encoder with a convolutional decoder to produce image segmentation mappings. The distributed sequence parallel approach is especially useful when the tokenized embedding representation of the image data is too large to fit into standard computing hardware memory. To demonstrate the performance and characteristics of models trained in a sequence parallel fashion compared to standard models, we evaluate our approach on a 3D MRI brain tumor segmentation dataset. We show that training with the sequence parallel approach can match standard sequential model training in terms of convergence. Furthermore, we show that the sequence parallel approach can support training of models that would not be possible on standard computing resources.
AB - We introduce an efficient distributed sequence parallel approach for training transformer-based deep learning image segmentation models. The neural network models combine a Vision Transformer encoder with a convolutional decoder to produce image segmentation mappings. The distributed sequence parallel approach is especially useful when the tokenized embedding representation of the image data is too large to fit into standard computing hardware memory. To demonstrate the performance and characteristics of models trained in a sequence parallel fashion compared to standard models, we evaluate our approach on a 3D MRI brain tumor segmentation dataset. We show that training with the sequence parallel approach can match standard sequential model training in terms of convergence. Furthermore, we show that the sequence parallel approach can support training of models that would not be possible on standard computing resources.
UR - http://www.scopus.com/inward/record.url?scp=85198743244&partnerID=8YFLogxK
U2 - 10.2352/EI.2024.36.12.HPCI-199
DO - 10.2352/EI.2024.36.12.HPCI-199
M3 - Conference article
AN - SCOPUS:85198743244
SN - 2470-1173
VL - 36
SP - 1991
EP - 1997
JO - IS&T International Symposium on Electronic Imaging Science and Technology
JF - IS&T International Symposium on Electronic Imaging Science and Technology
IS - 12
T2 - IS&T International Symposium on Electronic Imaging 2024: High Performance Computing for Imaging 2024
Y2 - 21 January 2024 through 25 January 2024
ER -