TY - GEN
T1 - Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC
AU - Aupy, Guillaume
AU - Faverge, Mathieu
AU - Robert, Yves
AU - Kurzak, Jakub
AU - Luszczek, Piotr
AU - Dongarra, Jack
PY - 2014
Y1 - 2014
N2 - This article introduces a new systolic algorithm for QR factorization, and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D-array and requires only local communications. The implementation of the algorithm uses threads at the node level, and MPI for internode communications. The complexity of the implementation is addressed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm, and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm exhibits competitive performance with state-of-the-art QR routines on a supercomputer called Kraken, which shows that high-level programming environments, such as PaRSEC, provide a viable alternative to enhance the production of quality software on complex and hierarchical architectures.
AB - This article introduces a new systolic algorithm for QR factorization, and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D-array and requires only local communications. The implementation of the algorithm uses threads at the node level, and MPI for internode communications. The complexity of the implementation is addressed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm, and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm exhibits competitive performance with state-of-the-art QR routines on a supercomputer called Kraken, which shows that high-level programming environments, such as PaRSEC, provide a viable alternative to enhance the production of quality software on complex and hierarchical architectures.
UR - https://www.scopus.com/pages/publications/84958534145
U2 - 10.1007/978-3-642-54420-0_64
DO - 10.1007/978-3-642-54420-0_64
M3 - Conference contribution
AN - SCOPUS:84958534145
SN - 9783642544194
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 657
EP - 667
BT - Euro-Par 2013
PB - Springer Verlag
T2 - 19th International Conference on Parallel Processing Workshops, Euro-Par 2013 - BigDataCloud, DIHC, FedICI, HeteroPar, HiBB, LSDVE, MHPC, OMHI, PADABS, PROPER, Resilience, ROME, and UCHPC 2013
Y2 - 26 August 2013 through 27 August 2013
ER -