Anatomy of a globally recursive embedded LINPACK benchmark

Jack Dongarra, Piotr Luszczek

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

22 Scopus citations

Abstract

We present a complete bottom-up implementation of an embedded LINPACK benchmark on iPad 2. We use a novel formulation of LU factorization that is recursive and parallel at the global scope. We believe our new algorithm presents an alternative to existing linear algebra parallelization techniques such as master-worker and DAG-based approaches. We show an assembly API that allows a much higher level of abstraction and provides rapid code development within the confines of the mobile device SDK. We use performance modeling to help cope with the limitations of the device and with the limited access to the device from a development environment not geared for HPC application tuning.
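For readers unfamiliar with the term, the idea of a recursive LU factorization can be illustrated with a minimal Python sketch. This is a generic, sequential, textbook-style recursive LU with partial pivoting. It is not the authors' globally recursive parallel formulation and not their iPad 2 code. The function name `recursive_lu`, the list-of-lists matrix layout, and the choice to swap whole rows at each pivot step are illustrative assumptions.

```python
def recursive_lu(a, piv, lo=0, hi=None):
    """Factor columns [lo, hi) of the square matrix `a` in place.

    Hypothetical helper for illustration only. On return, the strict
    lower triangle of `a` holds L (unit diagonal implied), the upper
    triangle holds U, and piv[k] is the row swapped with row k.
    """
    n = len(a)
    if hi is None:
        hi = n
    if hi - lo <= 0:
        return
    if hi - lo == 1:
        # Base case: one column. Pick the largest entry as the pivot.
        p = max(range(lo, n), key=lambda i: abs(a[i][lo]))
        if a[p][lo] == 0.0:
            raise ValueError("matrix is singular")
        piv[lo] = p
        a[lo], a[p] = a[p], a[lo]  # swap whole rows
        d = a[lo][lo]
        for i in range(lo + 1, n):
            a[i][lo] /= d
        return

    mid = (lo + hi) // 2

    # 1. Recursively factor the left half of the panel.
    recursive_lu(a, piv, lo, mid)

    # 2. Triangular solve: rows [lo, mid) of the right half become U.
    for j in range(lo, mid):
        for i in range(j + 1, mid):
            lij = a[i][j]
            for c in range(mid, hi):
                a[i][c] -= lij * a[j][c]

    # 3. Schur complement update of the trailing rows in the right half.
    for i in range(mid, n):
        for k in range(lo, mid):
            lik = a[i][k]
            for c in range(mid, hi):
                a[i][c] -= lik * a[k][c]

    # 4. Recursively factor the updated right half.
    recursive_lu(a, piv, mid, hi)
```

Calling `recursive_lu(a, piv)` with `piv = [0] * len(a)` overwrites `a` with the factors. Applying the recorded swaps to a copy of the original matrix, in order, should then reproduce the product of L and U up to rounding error.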

Original language: English
Title of host publication: 2012 IEEE Conference on High Performance Extreme Computing, HPEC 2012
State: Published - 2012
Event: 2012 IEEE Conference on High Performance Extreme Computing, HPEC 2012 - Waltham, MA, United States
Duration: Sep 10 2012 to Sep 12 2012
