Abstract
Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Remarkably accurate atomic structures have been recently computed for individual proteins by AlphaFold2 (AF2). Here, we demonstrate that the same neural network models from AF2 developed for single protein sequences can be adapted to predict the structures of multimeric protein complexes without retraining. In contrast to common approaches, our method, AF2Complex, does not require paired multiple sequence alignments. It achieves higher accuracy than some complex protein-protein docking strategies and provides a significant improvement over AF-Multimer, a development of AlphaFold for multimeric proteins. Moreover, we introduce metrics for predicting direct protein-protein interactions between arbitrary protein pairs and validate AF2Complex on some challenging benchmark sets and the E. coli proteome. Lastly, using the cytochrome c biogenesis system I as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system.
Original language | English |
---|---|
Article number | 1744 |
Journal | Nature Communications |
Volume | 13 |
Issue number | 1 |
DOIs | |
State | Published - Dec 2022 |
Funding
We thank Ada Sedova for coordinating the deployment of AlphaFold2 on Summit at Oak Ridge and for critical reading of the manuscript; Ryan Prout, Subil Abraham, Wael Elwasif, and N. Quentin Haas for building a Singularity container; and Mark Coletti for providing Dask scripts for running AF2. We thank Jessica Forness for proofreading the manuscript. This work was supported in part by the DOE Office of Science, Office of Biological and Environmental Research (DOE DE-SC0021303, J.S. and J.P.) and the Division of General Medical Sciences of the National Institutes of Health (NIH R35GM118039, J.S.). The research used resources supported in part by the Director’s Discretion Project at the Oak Ridge Leadership Computing Facility and the Advanced Scientific Computing Research (ASCR) Leadership Computing Challenge (ALCC) program (J.S., J.P., and M.G.).
We also acknowledge the computing resources provided by the Partnership for an Advanced Computing Environment (PACE) at the Georgia Institute of Technology.
Funders | Funder number |
---|---|
Division of General Medical Sciences | |
National Institutes of Health | |
U.S. Department of Energy | DE-SC0021303 |
National Institute of General Medical Sciences | R35GM118039 |
Office of Science | |
Advanced Scientific Computing Research | |
Biological and Environmental Research | |
Georgia Institute of Technology | |