TY - GEN
T1 - Framework for mapping data mining applications on GPUs
AU - Gainaru, Ana
AU - Slusanschi, Emil
PY - 2011
Y1 - 2011
N2 - Data mining algorithms are expensive by nature, and with today's dataset sizes they are becoming even slower and harder to use. Previous work has focused on parallelizing data mining algorithms on different architectures, and more recently, applications have started to take advantage of the massive computation power and high bandwidth offered by GPUs. However, there has been almost no prior work offering a general methodology for parallelizing all types of data mining applications on hybrid architectures. This paper presents a framework for fast and efficient parallelization of data mining algorithms on GPU systems. The framework implements I/O transfer models that handle the huge number of data entries, all with numerous dependencies, that are processed by this type of algorithm. The framework also allows users to specify data requirements for each task so that the data scheduler can efficiently map each task onto a GPU node and onto a block within each of these processors, improving the overall performance of the algorithm by around 20%.
AB - Data mining algorithms are expensive by nature, and with today's dataset sizes they are becoming even slower and harder to use. Previous work has focused on parallelizing data mining algorithms on different architectures, and more recently, applications have started to take advantage of the massive computation power and high bandwidth offered by GPUs. However, there has been almost no prior work offering a general methodology for parallelizing all types of data mining applications on hybrid architectures. This paper presents a framework for fast and efficient parallelization of data mining algorithms on GPU systems. The framework implements I/O transfer models that handle the huge number of data entries, all with numerous dependencies, that are processed by this type of algorithm. The framework also allows users to specify data requirements for each task so that the data scheduler can efficiently map each task onto a GPU node and onto a block within each of these processors, improving the overall performance of the algorithm by around 20%.
KW - GPU
KW - data mining applications
KW - parallelization
UR - http://www.scopus.com/inward/record.url?scp=84863328018&partnerID=8YFLogxK
U2 - 10.1109/ISPDC.2011.20
DO - 10.1109/ISPDC.2011.20
M3 - Conference contribution
AN - SCOPUS:84863328018
SN - 9780769545400
T3 - Proceedings - 2011 10th International Symposium on Parallel and Distributed Computing, ISPDC 2011
SP - 71
EP - 78
BT - Proceedings - 2011 10th International Symposium on Parallel and Distributed Computing, ISPDC 2011
T2 - 2011 10th International Symposium on Parallel and Distributed Computing, ISPDC 2011
Y2 - 6 July 2011 through 8 July 2011
ER -