Learning Equations from Biological Data with Limited Time Samples

John T. Nardini, John H. Lagergren, Andrea Hawkins-Daarud, Lee Curtin, Bethan Morris, Erica M. Rutter, Kristin R. Swanson, Kevin B. Flores

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

Equation learning methods present a promising tool to aid scientists in the modeling process for biological data. Previous equation learning studies have demonstrated that these methods can infer models from rich datasets; however, the performance of these methods in the presence of common challenges from biological data has not been thoroughly explored. We present an equation learning methodology comprised of data denoising, equation learning, model selection and post-processing steps that infers a dynamical systems model from noisy spatiotemporal data. The performance of this methodology is thoroughly investigated in the face of several common challenges presented by biological data, namely, sparse data sampling, large noise levels, and heterogeneity between datasets. We find that this methodology can accurately infer the correct underlying equation and predict unobserved system dynamics from a small number of time samples when the data are sampled over a time interval exhibiting both linear and nonlinear dynamics. Our findings suggest that equation learning methods can be used for model discovery and selection in many areas of biology when an informative dataset is used. We focus on glioblastoma multiforme modeling as a case study in this work to highlight how these results are informative for data-driven modeling-based tumor invasion predictions.
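The equation learning step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sequentially thresholded least squares as the sparse regression technique, uses noise-free synthetic logistic growth data in place of denoised spatiotemporal data, and leaves out the denoising, model selection and post-processing steps. The function name `stlsq`, the candidate library `[u, u^2, u^3]`, the threshold and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def stlsq(theta, dudt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (illustrative sparse regression).

    Fit all candidate terms, zero out coefficients smaller than `threshold`
    in magnitude, then refit using only the surviving terms.
    """
    xi = np.linalg.lstsq(theta, dudt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        xi[big] = np.linalg.lstsq(theta[:, big], dudt, rcond=None)[0]
    return xi


# Synthetic data (assumed for illustration): logistic growth u' = r*u*(1 - u),
# sampled over an interval that covers both the early near-linear growth phase
# and the later nonlinear saturation phase.
r, u0 = 0.5, 0.05
t = np.linspace(0.0, 20.0, 401)
u = u0 * np.exp(r * t) / (1.0 + u0 * (np.exp(r * t) - 1.0))

# Numerical differentiation of the sampled data.
dudt = np.gradient(u, t, edge_order=2)

# Library of candidate right-hand-side terms: u, u^2, u^3.
theta = np.column_stack([u, u**2, u**3])

xi = stlsq(theta, dudt)
print(dict(zip(["u", "u^2", "u^3"], xi)))
```

In this sketch the coefficient of `u^3` is pruned and the remaining coefficients approximate `r` and `-r`, that is, the logistic form that generated the data. With noisy or sparsely sampled data the numerical derivative degrades quickly, which is why a denoising step precedes equation learning in the methodology described above.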

Original language: English
Article number: 119
Journal: Bulletin of Mathematical Biology
Volume: 82
Issue number: 9
State: Published - Sep 1 2020
Externally published: Yes

Funding

This material was based upon work partially supported by the National Science Foundation under Grant DMS-1638521 to the Statistical and Applied Mathematical Sciences Institute and IOS-1838314 to KBF, and in part by National Institute on Aging Grant R21AG059099 to KBF. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. BM gratefully acknowledges Ph.D. studentship funding from the UK EPSRC (reference EP/N50970X/1). AHD, LC, and KRS gratefully acknowledge funding through NIH Grant U01CA220378 and James S. McDonnell Foundation Grant 220020264. Funding was provided by National Science Foundation (Grant Nos. 1638521, IOS-1838314), National Institute on Aging (Grant No. R21AG059099), National Institutes of Health (Grant No. U01CA220378), James S. McDonnell Foundation (Grant No. 220020264) and Engineering and Physical Sciences Research Council (Grant No. EP/N50970X/1).

Keywords

  • Equation learning
  • Glioblastoma multiforme
  • Model selection
  • Numerical differentiation
  • Parameter estimation
  • Partial differential equations
  • Population dynamics
  • Sparse regression
