TY - GEN
T1 - BlackjackBench
T2 - 2nd Int. Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, PMBS'11, Held as Part of the 24th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC'11
AU - Danalis, Anthony
AU - Luszczek, Piotr
AU - Dongarra, Jack
AU - Marin, Gabriel
AU - Vetter, Jeffrey S.
PY - 2011
Y1 - 2011
N2 - DARPA's AACE project aimed to develop Architecture Aware Compiler Environments that automatically characterize the hardware and optimize application codes accordingly. We present the BlackjackBench suite, a collection of portable micro-benchmarks that automate system characterization, plus statistical analysis techniques for interpreting the results. BlackjackBench discovers the effective sizes and speeds of the hardware environment rather than the often unattainable peak values. We target hardware features that can be observed by running executables generated by existing compilers from standard C code. We characterize the memory hierarchy, including cache sharing and NUMA characteristics of the system, properties of the processing cores affecting execution speed, and the length of the OS scheduler time slot. We show how these features of modern multicores can be discovered programmatically. We also show how the features can interfere with each other, resulting in incorrect interpretation of the results, and how established classification and statistical analysis techniques reduce experimental noise and aid automatic interpretation of results.
KW - Hardware characterization
KW - Micro-benchmarks
KW - Statistical analysis
UR - http://www.scopus.com/inward/record.url?scp=84856360344&partnerID=8YFLogxK
U2 - 10.1145/2088457.2088463
DO - 10.1145/2088457.2088463
M3 - Conference contribution
AN - SCOPUS:84856360344
SN - 9781450311021
T3 - PMBS'11 - Proceedings of the 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, Co-located with SC'11
SP - 7
EP - 8
BT - PMBS'11 - Proceedings of the 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, Co-located with SC'11
Y2 - 13 November 2011
ER -