TY - GEN
T1 - Parallel matrix transpose algorithms on distributed memory concurrent computers
AU - Choi, Jaeyoung
AU - Dongarra, J. J.
AU - Walker, D. W.
N1 - Publisher Copyright: © 1994 IEEE.
PY - 1993
Y1 - 1993
N2 - This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. We assume that the matrix is distributed over a P×Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The algorithms make use of nonblocking, point-to-point communication between processors. The use of nonblocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C=A·B, the algorithms are used to compute parallel multiplications of transposed matrices, C=A^T·B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
AB - This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. We assume that the matrix is distributed over a P×Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The algorithms make use of nonblocking, point-to-point communication between processors. The use of nonblocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C=A·B, the algorithms are used to compute parallel multiplications of transposed matrices, C=A^T·B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
UR - http://www.scopus.com/inward/record.url?scp=84947031567&partnerID=8YFLogxK
U2 - 10.1109/SPLC.1993.365559
DO - 10.1109/SPLC.1993.365559
M3 - Conference contribution
AN - SCOPUS:84947031567
T3 - Proceedings of Scalable Parallel Libraries Conference, SPLC 1993
SP - 245
EP - 252
BT - Proceedings of Scalable Parallel Libraries Conference, SPLC 1993
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1993 Scalable Parallel Libraries Conference, SPLC 1993
Y2 - 6 October 1993 through 8 October 1993
ER -