Project Details
Description
Leadership computing facilities for high-performance computing (HPC) invest heavily in file and storage systems. The reason is that the storage system is often the Achilles' heel of an HPC system, as it is prone to contention, congestion, and performance variability. This problem is getting worse due to: (a) the increased importance of data-driven HPC and the growth in the amount of data generated by large-scale simulations; and (b) the slower growth of disk speed compared to CPU speed. The addition of high-bandwidth persistent memory devices as burst buffers brings new opportunities for fast caching of application data while still allowing data persistence. However, the conventional approach of exploiting burst buffers as yet another caching layer can neither reduce the lengthy and costly data processing steps in the deep I/O stack nor reconcile occasional contention inside the complex storage system. This project, therefore, seeks to exploit burst buffers as repositories of persistent application-specific parallel file systems, with a lifetime commensurate with the lifetime of an application or an application campaign on an HPC system. This is a collaborative project between the University of Illinois at Urbana-Champaign and Florida State University.
This project formulates a research framework called Ephemeral Coherence Cohort (ECC) that offers an abstraction to represent the active collection of application data through containerization, insulate I/O activities across different applications, and enable storage disaggregation for ephemeral allocation and dynamic utilization of burst buffers. The proposed ECC framework aims to enhance a variety of mission-critical applications running on the Department of Energy and National Science Foundation leadership computing facilities. The project strengthens the collaboration between the University of Illinois at Urbana-Champaign and Florida State University. The project plans to organize panels and birds-of-a-feather sessions on burst buffer research at upcoming HPC conferences and to collaborate with leaders of supercomputing centers so that techniques from this research reach the wider community.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
| Status | Finished |
|---|---|
| Effective start/end date | 06/01/18 → 05/31/23 |
Funding
- National Science Foundation