Abstract
Image segmentation is a critical enabler for tasks ranging from medical diagnostics to autonomous driving. However, the correct segmentation semantics (where are boundaries located? which segments are logically similar?) change with the domain, so state-of-the-art foundation models can generate meaningless or incorrect results. Moreover, in certain domains, fine-tuning and retraining are infeasible: obtaining labels is costly and time-consuming; domain images (micrographs) can be exponentially diverse; and data sharing (for third-party retraining) is restricted. To enable rapid adaptation of the best segmentation technology, we propose the concept of semantic boosting: given a zero-shot foundation model, guide its segmentation and adjust its results to match domain expectations. We apply semantic boosting to the Segment Anything Model (SAM) to obtain microstructure segmentation for transmission electron microscopy. Our booster, SAM-I-Am, serves as a post-processing engine that extracts geometric and textural features of various intermediate masks to perform mask removal and mask merging operations. Over vanilla SAM (ViT-L), we demonstrate a zero-shot absolute increase of +21.35%, +12.6%, and +5.27% in mean IoU, and a drop of 9.91%, 18.42%, and 4.06% in mean false positive masks, across images of three difficulty classes.
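As an illustration of the quantities named in the abstract, the following is a minimal sketch of mask IoU together with toy mask removal and mask merging steps. It is not the SAM-I-Am implementation: masks are represented as sets of pixel coordinates, and the area threshold (`min_area`) and overlap-based merge rule (`iou_threshold`) are hypothetical stand-ins for the geometric and textural features the abstract refers to.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]   # (row, col)
Mask = Set[Pixel]         # a mask is the set of pixels it covers


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two masks; 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def remove_small_masks(masks: List[Mask], min_area: int) -> List[Mask]:
    """Toy mask removal: drop masks covering fewer than min_area pixels.

    Assumption: area is used here as a single, simple geometric feature.
    """
    return [m for m in masks if len(m) >= min_area]


def merge_overlapping(masks: List[Mask], iou_threshold: float) -> List[Mask]:
    """Toy mask merging: union any masks whose IoU meets the threshold.

    Assumption: overlap stands in for a real similarity criterion.
    """
    merged: List[Mask] = []
    for mask in masks:
        current = set(mask)
        changed = True
        # Keep absorbing previously merged masks until none overlaps enough.
        while changed:
            changed = False
            for other in merged:
                if iou(current, other) >= iou_threshold:
                    current |= other
                    merged.remove(other)
                    changed = True
                    break
        merged.append(current)
    return merged


if __name__ == "__main__":
    a = {(0, 0), (0, 1), (1, 0), (1, 1)}
    b = {(0, 1), (1, 1), (0, 2), (1, 2)}
    speck = {(9, 9)}
    kept = remove_small_masks([a, b, speck], min_area=2)
    print(len(kept), len(merge_overlapping(kept, iou_threshold=0.3)))
```

Mean IoU over an image set would then be the average of `iou(predicted, ground_truth)` across matched masks; how predictions are matched to ground truth is not specified in the abstract and is left out of this sketch.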
Original language | English |
---|---|
Article number | 113400 |
Journal | Computational Materials Science |
Volume | 246 |
DOIs | |
State | Published - Jan 2025 |
Funding
This research is supported by the U.S. Department of Energy (DOE) through the Office of Advanced Scientific Computing Research's “Orchestration for Distributed & Data-Intensive Scientific Exploration” and the “Cloud, HPC, and Edge for Science and Security” LDRD at Pacific Northwest National Laboratory. PNNL is operated by Battelle for the DOE under Contract DE-AC05-76RL01830.
Keywords
- Microstructure segmentation
- Segment anything model
- Semantic booster