Abstract
It has become common to perform kinetic analysis using approximate Koopman operators that transform high-dimensional time series of observables into ranked dynamical modes. The key to the practical success of the approach is the identification of a set of observables that form a good basis on which to expand the slow relaxation modes. Good observables are, however, difficult to identify a priori, and suboptimal choices can lead to significant underestimations of characteristic time scales. Leveraging the representation of slow dynamics in terms of hidden Markov models (HMMs), we propose a simple and computationally efficient clustering procedure to infer surrogate observables that form a good basis for slow modes. We apply the approach to an analytically solvable model system as well as to three protein systems of different complexities. We consistently demonstrate that the inferred indicator functions can significantly improve the estimation of the leading eigenvalues of Koopman operators and correctly identify key states and transition time scales of stochastic systems, even when good observables are not known a priori.
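As a rough illustration of the link between indicator functions, the leading Koopman eigenvalue, and a relaxation time scale, the following minimal sketch estimates a Koopman matrix on a basis of state indicator functions for a toy two-state Markov chain. It is not the clustering procedure proposed in the paper. The chain, its switching probabilities, and all function names are hypothetical choices made for this example. With an indicator basis, the least-squares Koopman estimate reduces to the row-normalized matrix of transition counts.

```python
import math
import random


def simulate_two_state(p01, p10, n_steps, seed=0):
    """Sample a two-state Markov chain (toy system, assumed for illustration)."""
    rng = random.Random(seed)
    state, traj = 0, []
    for _ in range(n_steps):
        traj.append(state)
        p_switch = p01 if state == 0 else p10
        if rng.random() < p_switch:
            state = 1 - state
    return traj


def koopman_matrix(traj, lag=1, n_states=2):
    """Koopman estimate on an indicator-function basis.

    With indicator functions as observables, the instantaneous covariance
    is diagonal (state counts), so C0^{-1} C_lag is simply the
    row-normalized matrix of transition counts at the given lag (lag >= 1).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(traj[:-lag], traj[lag:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]


def second_eigenvalue_2x2(K):
    """For a row-stochastic 2x2 matrix the eigenvalues are 1 and trace - 1."""
    return K[0][0] + K[1][1] - 1.0


def implied_timescale(lam, lag=1):
    """Relaxation time scale associated with an eigenvalue at a given lag."""
    return -lag / math.log(lam)


# Generating chain has switching probabilities 0.05 and 0.10,
# so its exact second eigenvalue is 1 - 0.05 - 0.10 = 0.85.
traj = simulate_two_state(0.05, 0.10, 200_000)
K = koopman_matrix(traj)
lam2 = second_eigenvalue_2x2(K)
t2 = implied_timescale(lam2)
print(f"estimated second eigenvalue: {lam2:.4f}")
print(f"implied time scale (steps):  {t2:.2f}")
```

Here the indicator functions coincide with the true states, so the estimated eigenvalue should lie close to the exact one. A basis that expands the slow mode poorly would typically yield a smaller eigenvalue and hence an underestimated time scale, which is the failure mode the abstract describes.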
Original language | English |
---|---|
Pages (from-to) | 7187-7198 |
Number of pages | 12 |
Journal | Journal of Chemical Theory and Computation |
Volume | 19 |
Issue number | 20 |
State | Published - Oct 24 2023 |
Funding
This work has been authored by employees of Triad National Security, LLC, which operates Los Alamos National Laboratory (LANL) under Contract No. 89233218CNA000001 with the U.S. Department of Energy/National Nuclear Security Administration. The work has been supported by the LDRD (Laboratory Directed Research and Development) program at LANL under Project 20190034ER (Massively Parallel Acceleration of the Dynamics of Complex Systems: a Data-Driven Approach). V.A.N. was partially supported by a Director’s Postdoctoral Fellowship, 20170692PRD4, for this work. V.A.N. is supported by the Oak Ridge National Laboratory, which is managed by UT-Battelle under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. This research used resources of the Oak Ridge Leadership Computing Facility (OLCF). The authors thank D.E. Shaw for making their MD data available for this study.