Abstract
Semantic segmentation of images is an important computer vision task that arises in a variety of application domains, such as medical imaging, robotic vision, and autonomous vehicles, to name a few. While these domain-specific image analysis tasks involve relatively small image sizes (∼10² × 10²), many applications need to train machine learning models on image data with extents that are orders of magnitude larger (∼10⁴ × 10⁴). Training deep neural network (DNN) models on large extent images is extremely memory-intensive and often exceeds the memory limits of a single graphics processing unit (GPU), the hardware accelerator of choice for computer vision workloads. Here, an efficient, sample parallel approach to train U-Net models on large extent image data sets is presented. Its advantages and limitations are analyzed, and near-linear strong-scaling speedup is demonstrated on 256 nodes (1536 GPUs) of the Summit supercomputer. Using a single node of the Summit supercomputer, an early evaluation of a recently released model parallel framework called GPipe is shown to deliver ∼2× speedup in executing a U-Net model with an order of magnitude more trainable parameters than reported before. Performance bottlenecks for pipelined training of U-Net models are identified, and mitigation strategies to improve the speedups are discussed. Together, these results open up the possibility of combining both approaches into a unified, scalable pipelined and data parallel algorithm to efficiently train U-Net models with very large receptive fields on data sets of ultra-large extent images.
Original language | English |
---|---|
Title of host publication | Proceedings of the 49th International Conference on Parallel Processing, ICPP 2020 |
Publisher | Association for Computing Machinery |
ISBN (Electronic) | 9781450388160 |
State | Published - Aug 17 2020 |
Externally published | Yes |
Event | 49th International Conference on Parallel Processing, ICPP 2020 - Virtual, Online, Canada; Duration: Aug 17 2020 → Aug 20 2020 |
Publication series
Name | ACM International Conference Proceeding Series |
---|---|
Conference
Conference | 49th International Conference on Parallel Processing, ICPP 2020 |
---|---|
Country/Territory | Canada |
City | Virtual, Online |
Period | 08/17/20 → 08/20/20 |
Funding
This research is sponsored by the AI Initiative as part of the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. This research used resources at the Oak Ridge Leadership Computing Facility, a DOE Office of Science User Facility operated by the Oak Ridge National Laboratory. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://energy.gov/downloads/doe-public-access-plan).
Keywords
- U-Net
- applied machine learning
- deep neural networks
- image segmentation
- model parallel
- pipeline
- scalable data analytics