Potentially adaptive SARS-CoV-2 mutations discovered with novel spatiotemporal and explainable AI models

Michael R. Garvin, Erica T. Prates, Mirko Pavicic, Piet Jones, B. Kirtley Amos, Armin Geiger, Manesh B. Shah, Jared Streich, Joao Gabriel Felipe Machado Gazolla, David Kainer, Ashley Cliff, Jonathon Romero, Nathan Keith, James B. Brown, Daniel Jacobson

Research output: Contribution to journalArticlepeer-review

52 Scopus citations

Abstract

Background: A mechanistic understanding of the spread of SARS-CoV-2 and diligent tracking of ongoing mutagenesis are of key importance to plan robust strategies for confining its transmission. Large numbers of available sequences and their dates of transmission provide an unprecedented opportunity to analyze evolutionary adaptation in novel ways. Addition of high-resolution structural information can reveal the functional basis of these processes at the molecular level. Integrated systems biology-directed analyses of these data layers afford valuable insights to build a global understanding of the COVID-19 pandemic. Results: Here we identify globally distributed haplotypes from 15,789 SARS-CoV-2 genomes and model their success based on their duration, dispersal, and frequency in the host population. Our models identify mutations that are likely compensatory adaptive changes that allowed for rapid expansion of the virus. Functional predictions from structural analyses indicate that, contrary to previous reports, the Asp614Gly mutation in the spike glycoprotein (S) likely reduced transmission and the subsequent Pro323Leu mutation in the RNA-dependent RNA polymerase led to the precipitous spread of the virus. Our model also suggests that two mutations in the nsp13 helicase allowed for the adaptation of the virus to the Pacific Northwest of the USA. Finally, our explainable artificial intelligence algorithm identified a mutational hotspot in the sequence of S that also displays a signature of positive selection and may have implications for tissue or cell-specific expression of the virus. Conclusions: These results provide valuable insights for the development of drugs and surveillance strategies to combat the current and future pandemics.

Original languageEnglish
Article number304
JournalGenome Biology
Volume21
Issue number1
DOIs
StatePublished - Dec 2020

Funding

We would like to acknowledge funding from the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the US Department of Energy (LOIS:10074) (which supported the genome sequence collection and curation), DOE Office of Science through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on response to COVID-19 (which supported the work for potential points of diagnostic and therapeutic intervention), with funding provided by the Coronavirus CARES Act. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. We gratefully acknowledge the Originating laboratories responsible for obtaining the viral specimens and the Submitting laboratories where genetic sequence data were generated and shared via the GISAID Initiative, on which this research is based. This work was supported by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the US Department of Energy (LOIS:10074), DOE Office of Science through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on response to COVID-19, with funding provided by the Coronavirus CARES Act. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan ( http://energy.gov/downloads/doe-public-access-plan ).

FundersFunder number
National Virtual Biotechnology Laboratory
US Department of Energy
U.S. Department of EnergyDE-AC05-00OR22725
Office of Science
Oak Ridge National Laboratory
UT-Battelle

    Keywords

    • Adaptive mutation
    • COVID-19
    • Coronavirus
    • Local adaptation
    • Molecular evolution
    • SARS-CoV-2

    Fingerprint

    Dive into the research topics of 'Potentially adaptive SARS-CoV-2 mutations discovered with novel spatiotemporal and explainable AI models'. Together they form a unique fingerprint.

    Cite this