TY - GEN
T1 - Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling
AU - Wu, Tzuhsien
AU - Chou, Jerry
AU - Hao, Shyng
AU - Dong, Bin
AU - Klasky, Scott
AU - Wu, Kesheng
N1 - Publisher Copyright: © 2017 ACM.
PY - 2017
Y1 - 2017
AB - Indexing has become an efficient tool that enables scientists to directly access the most relevant data records. However, the time and space requirements of building and storing indexes are expensive in traditional approaches such as R-trees and bitmaps. Recently, we started to address this issue by using the idea of a 'block index', and our previous work has shown promising results when comparing it against other well-known solutions, including ADIOS, SciDB, and FastBit. In this work, we further improve the technique from both theoretical and implementation perspectives. Driven by an extensive effort in characterizing scientific datasets and modeling I/O systems, we present a theoretical model to analyze its query performance with respect to a given block size configuration. We also introduce three optimization techniques that achieve a 2.3x query time reduction compared to the original implementation.
KW - I/O system
KW - Indexing
KW - Modeling
KW - Performance analysis
KW - Scientific data
UR - http://www.scopus.com/inward/record.url?scp=85142213596&partnerID=8YFLogxK
U2 - 10.1145/3126908.3126934
DO - 10.1145/3126908.3126934
M3 - Conference contribution
AN - SCOPUS:85142213596
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - SC 2017 - International Conference for High Performance Computing, Networking, Storage and Analysis
PB - IEEE Computer Society
T2 - 2017 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017
Y2 - 12 November 2017 through 17 November 2017
ER -