TY - GEN
T1 - Toward Large-Scale Image Segmentation on Summit
AU - Seal, Sudip K.
AU - Lim, Seung Hwan
AU - Wang, Dali
AU - Hinkle, Jacob
AU - Lunga, Dalton
AU - Tsaris, Aristeidis
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/8/17
Y1 - 2020/8/17
N2 - Semantic segmentation of images is an important computer vision task that arises in a variety of application domains such as medical imaging, robotic vision, and autonomous vehicles, to name a few. While these domain-specific image analysis tasks involve relatively small image sizes (∼10² × 10²), there are many applications that need to train machine learning models on image data with extents that are orders of magnitude larger (∼10⁴ × 10⁴). Training deep neural network (DNN) models on large-extent images is extremely memory-intensive and often exceeds the memory capacity of a single graphics processing unit (GPU), the hardware accelerator of choice for computer vision workloads. Here, an efficient, sample-parallel approach to train U-Net models on large-extent image data sets is presented. Its advantages and limitations are analyzed, and near-linear strong-scaling speedup is demonstrated on 256 nodes (1536 GPUs) of the Summit supercomputer. Using a single node of the Summit supercomputer, an early evaluation of the recently released model-parallel framework GPipe is shown to deliver a ∼2X speedup when executing a U-Net model with an order of magnitude more trainable parameters than previously reported. Performance bottlenecks for pipelined training of U-Net models are identified, and mitigation strategies to improve the speedups are discussed. Together, these results open up the possibility of combining both approaches into a unified, scalable, pipelined and data-parallel algorithm to efficiently train U-Net models with very large receptive fields on data sets of ultra-large-extent images.
AB - Semantic segmentation of images is an important computer vision task that arises in a variety of application domains such as medical imaging, robotic vision, and autonomous vehicles, to name a few. While these domain-specific image analysis tasks involve relatively small image sizes (∼10² × 10²), there are many applications that need to train machine learning models on image data with extents that are orders of magnitude larger (∼10⁴ × 10⁴). Training deep neural network (DNN) models on large-extent images is extremely memory-intensive and often exceeds the memory capacity of a single graphics processing unit (GPU), the hardware accelerator of choice for computer vision workloads. Here, an efficient, sample-parallel approach to train U-Net models on large-extent image data sets is presented. Its advantages and limitations are analyzed, and near-linear strong-scaling speedup is demonstrated on 256 nodes (1536 GPUs) of the Summit supercomputer. Using a single node of the Summit supercomputer, an early evaluation of the recently released model-parallel framework GPipe is shown to deliver a ∼2X speedup when executing a U-Net model with an order of magnitude more trainable parameters than previously reported. Performance bottlenecks for pipelined training of U-Net models are identified, and mitigation strategies to improve the speedups are discussed. Together, these results open up the possibility of combining both approaches into a unified, scalable, pipelined and data-parallel algorithm to efficiently train U-Net models with very large receptive fields on data sets of ultra-large-extent images.
KW - U-Net
KW - applied machine learning
KW - deep neural networks
KW - image segmentation
KW - model parallel
KW - pipeline
KW - scalable data analytics
UR - http://www.scopus.com/inward/record.url?scp=85090574972&partnerID=8YFLogxK
U2 - 10.1145/3404397.3404468
DO - 10.1145/3404397.3404468
M3 - Conference contribution
AN - SCOPUS:85090574972
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 49th International Conference on Parallel Processing, ICPP 2020
PB - Association for Computing Machinery
T2 - 49th International Conference on Parallel Processing, ICPP 2020
Y2 - 17 August 2020 through 20 August 2020
ER -