TY - GEN
T1 - Active-learning-based surrogate models for empirical performance tuning
AU - Balaprakash, Prasanna
AU - Gramacy, Robert B.
AU - Wild, Stefan M.
PY - 2013
Y1 - 2013
AB - Performance models have profound impact on hardware-software codesign, architectural explorations, and performance tuning of scientific applications. Developing algebraic performance models is becoming an increasingly challenging task. In such situations, a statistical surrogate-based performance model, fitted to a small number of input-output points obtained from empirical evaluation on the target machine, provides a range of benefits. Accurate surrogates can emulate the output of the expensive empirical evaluation at new inputs and therefore can be used to test and/or aid search, compiler, and autotuning algorithms. We present an iterative parallel algorithm that builds surrogate performance models for scientific kernels and workloads on single-core and multicore and multinode architectures. We tailor to our unique parallel environment an active learning heuristic popular in the literature on the sequential design of computer experiments in order to identify the code variants whose evaluations have the best potential to improve the surrogate. We use the proposed approach in a number of case studies to illustrate its effectiveness.
UR - http://www.scopus.com/inward/record.url?scp=84893619693&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2013.6702683
DO - 10.1109/CLUSTER.2013.6702683
M3 - Conference contribution
AN - SCOPUS:84893619693
SN - 9781479908981
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
BT - 2013 IEEE International Conference on Cluster Computing, CLUSTER 2013
T2 - 15th IEEE International Conference on Cluster Computing, CLUSTER 2013
Y2 - 23 September 2013 through 27 September 2013
ER -