TY - GEN
T1 - Accelerating conjugate gradient using OmpSs
AU - Catalan, Sandra
AU - Martorell, Xavier
AU - Labarta, Jesus
AU - Usui, Tetsuzo
AU - Diaz, Leonel Antonio Toledo
AU - Valero-Lara, Pedro
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
AB - In this paper, we present the benefits of using the OmpSs concurrent clause when performing reductions, specifically when applied to dot product (DOT) operations. We analyze these benefits through the implementation of different versions of the Conjugate Gradient (CG) method. We start from a parallel version of the code based on tasks and dependencies; we then introduce the concurrent clause, which allows the execution of tasks that have data dependencies among them to overlap. In this way, we aim to show the benefits of the concurrent clause, which might be included in the OpenMP standard, as has previously been done with other OmpSs features. Our tests, performed on a single node of the (Intel-based) MareNostrum 4 supercomputer and a single socket of the (ARM-based) Dibona cluster, show that using the concurrent clause may improve performance over the version that uses only tasks and dependencies by around 37% and 23%, respectively.
KW - Concurrent
KW - Conjugate gradient
KW - OmpSs
KW - Reduction
UR - http://www.scopus.com/inward/record.url?scp=85083185049&partnerID=8YFLogxK
U2 - 10.1109/PDCAT46702.2019.00033
DO - 10.1109/PDCAT46702.2019.00033
M3 - Conference contribution
AN - SCOPUS:85083185049
T3 - Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
SP - 121
EP - 126
BT - Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
A2 - Tian, Hui
A2 - Shen, Hong
A2 - Tan, Wee Lum
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
Y2 - 5 December 2019 through 7 December 2019
ER -