TY - GEN
T1 - The impact of multicore on math software
AU - Buttari, Alfredo
AU - Dongarra, Jack
AU - Kurzak, Jakub
AU - Langou, Julien
AU - Luszczek, Piotr
AU - Tomov, Stanimire
PY - 2007
Y1 - 2007
N2 - Power consumption and heat dissipation issues are pushing the microprocessor industry towards multicore design patterns. Given the cubic dependence between core frequency and power consumption, multicore technologies leverage the idea that doubling the number of cores and halving the core frequency gives roughly the same performance while reducing power consumption by a factor of four. With the number of cores on multicore chips expected to reach tens in a few years, efficient implementations of numerical libraries using shared memory programming models are of high interest. The current message passing paradigm used in ScaLAPACK and elsewhere introduces unnecessary memory overhead and memory copy operations, which degrade performance and make it harder to schedule operations that could be done in parallel. Limiting the use of shared memory to fork-join parallelism (perhaps with OpenMP) or to its use within the BLAS does not address all these issues.
AB - Power consumption and heat dissipation issues are pushing the microprocessor industry towards multicore design patterns. Given the cubic dependence between core frequency and power consumption, multicore technologies leverage the idea that doubling the number of cores and halving the core frequency gives roughly the same performance while reducing power consumption by a factor of four. With the number of cores on multicore chips expected to reach tens in a few years, efficient implementations of numerical libraries using shared memory programming models are of high interest. The current message passing paradigm used in ScaLAPACK and elsewhere introduces unnecessary memory overhead and memory copy operations, which degrade performance and make it harder to schedule operations that could be done in parallel. Limiting the use of shared memory to fork-join parallelism (perhaps with OpenMP) or to its use within the BLAS does not address all these issues.
UR - http://www.scopus.com/inward/record.url?scp=38049058008&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-75755-9_1
DO - 10.1007/978-3-540-75755-9_1
M3 - Conference contribution
AN - SCOPUS:38049058008
SN - 9783540757542
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 10
BT - Applied Parallel Computing
PB - Springer Verlag
T2 - 8th International Workshop on Applied Parallel Computing, PARA 2006
Y2 - 18 June 2006 through 21 June 2006
ER -