neCODEC: Nearline data compression for scientific applications

Yuan Tian, Cong Xu, Weikuan Yu, Jeffrey S. Vetter, Scott Klasky, Honggao Liu, Saad Biaz

Research output: Contribution to journal › Article › peer-review

Abstract

Advances in multicore technologies lead to processors with tens and soon hundreds of cores in a single socket, resulting in an ever-growing gap between computing power and the memory and I/O bandwidth available for data handling. It would be beneficial if some of this computing power could be transformed into gains in I/O efficiency, thereby reducing the speed disparity between computing and I/O. In this paper, we design and implement a NEarline data COmpression and DECompression (neCODEC) scheme for data-intensive parallel applications. Several salient techniques are introduced in neCODEC, including asynchronous compression threads, elastic file representation, distributed metadata handling, and balanced subfile distribution. Our performance evaluation indicates that neCODEC can improve the performance of a variety of data-intensive microbenchmarks and scientific applications. In particular, neCODEC is capable of increasing the effective bandwidth of S3D, a combustion simulation code, by more than 5 times.

Original language: English
Pages (from-to): 475-486
Number of pages: 12
Journal: Cluster Computing
Volume: 17
Issue number: 2
State: Published - Jan 2014

Funding

Acknowledgements: This work is funded in part by National Science Foundation awards CNS-0917137 and CNS-1059376. This research is sponsored in part by the Office of Advanced Scientific Computing Research, U.S. Department of Energy. This research is conducted with high performance computational resources provided by the Louisiana Optical Network Initiative (http://www.loni.org). We are very grateful for the technical support from the LONI team.

Funders                                   Funder number
National Science Foundation               1059376, CNS-1059376, 1320016, CNS-0917137
U.S. Department of Energy
Advanced Scientific Computing Research

    Keywords

    • Data compression
    • Lustre
    • MPI-IO
