TY - GEN
T1 - Optimizing the query performance of block index through data analysis and I/O modeling
AU - Wu, Tzuhsien
AU - Chou, Jerry
AU - Hao, Shyng
AU - Dong, Bin
AU - Klasky, Scot
AU - Wu, Kesheng
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/11/12
Y1 - 2017/11/12
N2 - Indexing has become an efficient tool that enables scientists to directly access the most relevant data records. However, traditional approaches such as R-trees and bitmaps incur high time and space costs for building and storing indexes. Recently, we started to address this issue by using the idea of a "block index", and our previous work has shown promising results in comparisons against other well-known solutions, including ADIOS, SciDB, and FastBit. In this work, we further improve the technique from both theoretical and implementation perspectives. Driven by an extensive effort in characterizing scientific datasets and modeling I/O systems, we present a theoretical model to analyze the query performance of the block index with respect to a given block size configuration. We also introduce three optimization techniques that achieve a 2.3x query time reduction compared to the original implementation.
AB - Indexing has become an efficient tool that enables scientists to directly access the most relevant data records. However, traditional approaches such as R-trees and bitmaps incur high time and space costs for building and storing indexes. Recently, we started to address this issue by using the idea of a "block index", and our previous work has shown promising results in comparisons against other well-known solutions, including ADIOS, SciDB, and FastBit. In this work, we further improve the technique from both theoretical and implementation perspectives. Driven by an extensive effort in characterizing scientific datasets and modeling I/O systems, we present a theoretical model to analyze the query performance of the block index with respect to a given block size configuration. We also introduce three optimization techniques that achieve a 2.3x query time reduction compared to the original implementation.
KW - I/O system
KW - Indexing
KW - Modeling
KW - Performance analysis
KW - Scientific data
UR - https://www.scopus.com/pages/publications/85040197205
U2 - 10.1145/3126908.3126934
DO - 10.1145/3126908.3126934
M3 - Conference contribution
AN - SCOPUS:85040197205
T3 - Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017
BT - Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017
PB - Association for Computing Machinery, Inc
T2 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017
Y2 - 12 November 2017 through 17 November 2017
ER -