TY - GEN
T1 - Performance characterization of molecular dynamics techniques for biomolecular simulations
AU - Alam, Sadaf R.
AU - Vetter, Jeffrey S.
AU - Agarwal, Pratul K.
AU - Geist, Al
PY - 2006
Y1 - 2006
N2 - Large-scale simulations and computational modeling using molecular dynamics (MD) continue to make significant impacts in the field of biology. It is well known that simulations of biological events at native time and length scales require computing power several orders of magnitude beyond today's commonly available systems. Supercomputers, such as IBM Blue Gene/L and Cray XT3, will soon make tens to hundreds of teraFLOP/s of computing power available by utilizing thousands of processors. The popular algorithms and MD applications, however, were not initially designed to run on thousands of processors. In this paper, we present detailed investigations of the performance issues that are crucial for improving the scalability of the MD-related algorithms and applications on massively parallel processing (MPP) architectures. Due to the varying characteristics of biological input problems, we study two prototypical biological complexes that use the MD algorithm: an explicit solvent and an implicit solvent. In particular, we study the AMBER application, which supports a variety of these types of input problems. For the explicit solvent problem, we focused on the particle mesh Ewald (PME) method for calculating the electrostatic energy, and for the implicit solvent model, we targeted the Generalized Born (GB) calculation. We uncovered and subsequently modified a limitation in AMBER that restricted the scaling beyond 128 processors. We collected performance data for experiments on up to 2048 Blue Gene/L and XT3 processors and subsequently identified that the scaling is largely limited by the underlying algorithmic characteristics and also by the implementation of the algorithms. Furthermore, we found that the input problem size of the biological system is constrained by the memory available per node. In conclusion, our results indicate that MD codes can significantly benefit from the current generation architectures with relatively modest optimization efforts. Nevertheless, the key for enabling scientific breakthroughs lies in exploiting the full potential of these new architectures.
AB - Large-scale simulations and computational modeling using molecular dynamics (MD) continue to make significant impacts in the field of biology. It is well known that simulations of biological events at native time and length scales require computing power several orders of magnitude beyond today's commonly available systems. Supercomputers, such as IBM Blue Gene/L and Cray XT3, will soon make tens to hundreds of teraFLOP/s of computing power available by utilizing thousands of processors. The popular algorithms and MD applications, however, were not initially designed to run on thousands of processors. In this paper, we present detailed investigations of the performance issues that are crucial for improving the scalability of the MD-related algorithms and applications on massively parallel processing (MPP) architectures. Due to the varying characteristics of biological input problems, we study two prototypical biological complexes that use the MD algorithm: an explicit solvent and an implicit solvent. In particular, we study the AMBER application, which supports a variety of these types of input problems. For the explicit solvent problem, we focused on the particle mesh Ewald (PME) method for calculating the electrostatic energy, and for the implicit solvent model, we targeted the Generalized Born (GB) calculation. We uncovered and subsequently modified a limitation in AMBER that restricted the scaling beyond 128 processors. We collected performance data for experiments on up to 2048 Blue Gene/L and XT3 processors and subsequently identified that the scaling is largely limited by the underlying algorithmic characteristics and also by the implementation of the algorithms. Furthermore, we found that the input problem size of the biological system is constrained by the memory available per node. In conclusion, our results indicate that MD codes can significantly benefit from the current generation architectures with relatively modest optimization efforts. Nevertheless, the key for enabling scientific breakthroughs lies in exploiting the full potential of these new architectures.
KW - Computational biology
KW - Molecular dynamics algorithms
KW - Performance analysis
KW - Workload characterization
UR - http://www.scopus.com/inward/record.url?scp=33751040386&partnerID=8YFLogxK
U2 - 10.1145/1122971.1122983
DO - 10.1145/1122971.1122983
M3 - Conference contribution
AN - SCOPUS:33751040386
SN - 1595931899
SN - 9781595931894
T3 - Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP
SP - 59
EP - 68
BT - Proceedings of the 2006 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'06
PB - Association for Computing Machinery (ACM)
T2 - 2006 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'06
Y2 - 29 March 2006 through 31 March 2006
ER -