Abstract
Heterogeneous computing with accelerators is growing in importance in high performance computing (HPC). Recently, application datasets have expanded beyond the memory capacity of these accelerators, and often beyond the capacity of their hosts. Meanwhile, nonvolatile memory (NVM) storage has emerged as a pervasive component in HPC systems because NVM provides massive amounts of memory capacity at an affordable cost. Currently, for accelerator applications to use NVM, they must manually orchestrate data movement across multiple memories, and this approach performs well only for applications with simple access behaviors. To address this issue, we developed DRAGON, a solution that enables all classes of GP-GPU applications to transparently compute on terabyte datasets residing in NVM. DRAGON leverages the page-faulting mechanism of recent NVIDIA GPUs by extending the capabilities of CUDA Unified Memory (UM). Our experimental results show that DRAGON transparently expands memory capacity and obtains additional speedups by automatically overlapping I/O and data transfers.
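The Unified Memory page-faulting mechanism that DRAGON extends can be illustrated with the standard CUDA runtime API alone. The sketch below is a minimal, hypothetical example (it does not show DRAGON's own interface): a single `cudaMallocManaged` allocation is touched by both host and device, and the driver migrates pages on demand instead of requiring explicit `cudaMemcpy` orchestration.

```cuda
// Minimal sketch of CUDA Unified Memory (baseline mechanism only, not DRAGON's API).
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *data, size_t n, float factor) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;   // pages are faulted onto the GPU on first touch
}

int main() {
    const size_t n = 1 << 20;
    float *data = nullptr;

    // One managed allocation visible to both CPU and GPU; the driver migrates
    // pages on demand via the GPU page-fault mechanism, no explicit copies.
    cudaMallocManaged(&data, n * sizeof(float));

    for (size_t i = 0; i < n; ++i) data[i] = 1.0f;   // touched on the host first

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);  // faults pages to the GPU
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);               // faults pages back to the host
    cudaFree(data);
    return 0;
}
```

DRAGON builds on this on-demand migration path to reach datasets in NVM that exceed both GPU and host memory, but its specific mapping interface is described in the paper rather than shown here.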
Original language | English |
---|---|
Title of host publication | Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 414-426 |
Number of pages | 13 |
ISBN (Electronic) | 9781538683842 |
DOIs | |
State | Published - Jul 2 2018 |
Event | 2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018 - Dallas, United States. Duration: Nov 11 2018 → Nov 16 2018
Publication series
Name | Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018 |
---|---
Conference
Conference | 2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018 |
---|---|
Country/Territory | United States |
City | Dallas |
Period | 11/11/18 → 11/16/18 |
Funding
This research was partially supported by JST CREST Grant Numbers JPMJCR1303 (EBD CREST) and JPMJCR1687 (DEEP CREST), and performed under the auspices of Real-World Big-Data Computation Open Innovation Laboratory (RWBC-OIL), Japan. This research was also supported in part by Oak Ridge National Laboratory ASTRO Program sponsored by the US Department of Energy and administered by the Oak Ridge Institute for Science and Education, USA. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Driver
- GPU
- Large data
- Memory
- Out-of-core