Abstract
Attention-based models are proliferating in image analytics, including segmentation. The standard way to feed an image to a transformer encoder is to divide it into patches and pass them to the model as a linear sequence of tokens. For high-resolution images, e.g. microscopic pathology images, the quadratic compute and memory cost of attention prohibits the use of such models with the small patch sizes that favor segmentation. Existing solutions rely on either complex custom multi-resolution models or approximate attention schemes. We take inspiration from Adaptive Mesh Refinement (AMR) methods in HPC and adaptively patch the images, as a pre-processing step, based on image details, reducing the number of patches fed to the model by orders of magnitude. The method has negligible overhead and, being a pre-processing step, can be adopted by any attention-based model without friction. We demonstrate superior segmentation quality over SoTA segmentation models on real-world pathology datasets while gaining a geomean speedup of 6.9× for resolutions up to 64K², on up to 2,048 GPUs.
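The idea of detail-driven patching can be illustrated with a minimal quadtree sketch. This is not the authors' implementation: the detail criterion (intensity range inside a patch), the `threshold` and `min_size` parameters, and the function names are all assumptions chosen for illustration, and the image is assumed to be square with a power-of-two side length.

```python
from typing import List, Tuple

Patch = Tuple[int, int, int]  # (row, col, size) of a square patch


def detail(image: List[List[float]], r: int, c: int, size: int) -> float:
    """Hypothetical detail score: intensity range (max - min) inside the patch."""
    vals = [image[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    return max(vals) - min(vals)


def adaptive_patches(image: List[List[float]],
                     min_size: int = 2,
                     threshold: float = 0.1) -> List[Patch]:
    """Split a square image quadtree-style; refine only where detail is high."""
    n = len(image)  # assumption: n x n image, n a power of two
    patches: List[Patch] = []
    stack: List[Patch] = [(0, 0, n)]
    while stack:
        r, c, size = stack.pop()
        if size > min_size and detail(image, r, c, size) > threshold:
            half = size // 2
            # Refine: replace this patch with its four quadrants.
            stack.extend([(r, c, half), (r, c + half, half),
                          (r + half, c, half), (r + half, c + half, half)])
        else:
            # Smooth enough (or already at the finest size): keep as one patch.
            patches.append((r, c, size))
    return sorted(patches)


# Toy 8x8 image that is flat except for one bright pixel in a corner.
image = [[0.0] * 8 for _ in range(8)]
image[0][0] = 1.0
patches = adaptive_patches(image, min_size=2, threshold=0.1)
print(len(patches))  # 7 patches, versus 16 for a uniform 2x2 grid
```

In this toy case only the quadrant containing the bright pixel is refined down to the finest size, so the sequence length drops from 16 to 7. In a real pipeline the coarse patches would still have to be resampled to a common token size before reaching the model; that step is omitted here.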
| Original language | English |
|---|---|
| Title of host publication | Proceedings of SC 2024 |
| Subtitle of host publication | International Conference for High Performance Computing, Networking, Storage and Analysis |
| Publisher | IEEE Computer Society |
| ISBN (Electronic) | 9798350352917 |
| State | Published - 2024 |
| Event | 2024 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2024 - Atlanta, United States; Duration: Nov 17 2024 → Nov 22 2024 |
Publication series
| Name | International Conference for High Performance Computing, Networking, Storage and Analysis, SC |
|---|---|
| ISSN (Print) | 2167-4329 |
| ISSN (Electronic) | 2167-4337 |
Conference
| Conference | 2024 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2024 |
|---|---|
| Country/Territory | United States |
| City | Atlanta |
| Period | 11/17/24 → 11/22/24 |
Funding
The AI-Compliant Advanced Computer System Joint Research Project 2022 Information Initiative Center, Hokkaido University, Sapporo, Japan, partly supported the work. JST SPRING Grant Number JPMJSP2119 also supported this work. This work was supported by JSPS KAKENHI under Grant Number JP21K17750. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory (ORNL), which is supported by the Office of Science of the U.S. Department of Energy (DOE) under Contract No. DE-AC05-00OR22725. This manuscript has been coauthored by ORNL, operated by UT-Battelle, LLC with the U.S. Department of Energy. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government.