TY - GEN
T1 - Parallel simulation of superscalar scheduling
AU - Haugen, Blake
AU - Kurzak, Jakub
AU - Yarkhan, Asim
AU - Luszczek, Piotr
AU - Dongarra, Jack
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/11/13
Y1 - 2014/11/13
AB - Computers have been moving toward a multicore paradigm for the last several years. As a result of the recent multicore paradigm shift, software developers must design applications that exploit the inherent parallelism of modern computing architectures. One of the areas of research to simplify this shift is the development of dynamic scheduling utilities that allow the developer to specify serial code that can be parallelized using a library or compiler technology. While these tools certainly increase the developer's productivity, they can obfuscate performance bottlenecks. For this reason, it is important to evaluate algorithm performance in order to ensure that the performance of a given algorithm is being realized using dynamic scheduling utilities. This paper presents the methodology and results of a new performance analysis tool that aims to accurately simulate the performance of various superscalar schedulers, including OmpSs, StarPU, and QUARK. The process begins with careful timing of each of the computational routines that make up the algorithm. The simulation tool then uses the timing of the computational kernels in conjunction with the dependency management provided by the superscalar scheduler in order to simulate the execution time of the algorithm. This tool demonstrates that simulation results of various algorithms can accurately predict the performance of a complex dynamic scheduling system.
KW - Performance modeling
KW - Simulation
KW - Superscalar scheduling
UR - http://www.scopus.com/inward/record.url?scp=84932620214&partnerID=8YFLogxK
U2 - 10.1109/ICPP.2014.21
DO - 10.1109/ICPP.2014.21
M3 - Conference contribution
AN - SCOPUS:84932620214
T3 - Proceedings of the International Conference on Parallel Processing
SP - 121
EP - 130
BT - Proceedings - 43rd International Conference on Parallel Processing, ICPP 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 43rd International Conference on Parallel Processing, ICPP 2014
Y2 - 9 September 2014 through 12 September 2014
ER -