TY - GEN
T1 - Anatomy of a globally recursive embedded LINPACK benchmark
AU - Dongarra, Jack
AU - Luszczek, Piotr
PY - 2012
Y1 - 2012
N2 - We present a complete bottom-up implementation of an embedded LINPACK benchmark on the iPad 2. We use a novel formulation of a recursive LU factorization that is recursive and parallel at the global scope. We believe our new algorithm presents an alternative to existing linear algebra parallelization techniques such as master-worker and DAG-based approaches. We show an assembly API that allows us a much higher level of abstraction and provides rapid code development within the confines of the mobile device SDK. We use performance modeling to help with the limitations of the device and the limited access to the device from a development environment not geared for HPC application tuning.
AB - We present a complete bottom-up implementation of an embedded LINPACK benchmark on the iPad 2. We use a novel formulation of a recursive LU factorization that is recursive and parallel at the global scope. We believe our new algorithm presents an alternative to existing linear algebra parallelization techniques such as master-worker and DAG-based approaches. We show an assembly API that allows us a much higher level of abstraction and provides rapid code development within the confines of the mobile device SDK. We use performance modeling to help with the limitations of the device and the limited access to the device from a development environment not geared for HPC application tuning.
UR - http://www.scopus.com/inward/record.url?scp=84873530904&partnerID=8YFLogxK
U2 - 10.1109/HPEC.2012.6408679
DO - 10.1109/HPEC.2012.6408679
M3 - Conference contribution
AN - SCOPUS:84873530904
SN - 9781467315760
T3 - 2012 IEEE Conference on High Performance Extreme Computing, HPEC 2012
BT - 2012 IEEE Conference on High Performance Extreme Computing, HPEC 2012
T2 - 2012 IEEE Conference on High Performance Extreme Computing, HPEC 2012
Y2 - 10 September 2012 through 12 September 2012
ER -