Benchmarking of TASSER in the ab initio limit

Jose M. Borreguero, Jeffrey Skolnick

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

A significant number of protein sequences in a given proteome have no obvious evolutionarily related protein in the database of solved protein structures, the PDB. Under these conditions, ab initio or template-free modeling methods are the sole means of predicting protein structure. To assess its expected performance on proteomes, the TASSER structure prediction algorithm is benchmarked in the ab initio limit on a representative set of 1129 nonhomologous sequences ranging from 40 to 200 residues that cover the PDB at 30% sequence identity and adopt α, α + β, and β secondary structures. For sequences in the 40-100 (100-200) residue range, as assessed by their root mean square deviation (RMSD) from native, the best of the top five ranked models of TASSER has a global fold that is significantly close to the native structure for 25% (16%) of the sequences, and correctly identifies the structure of the protein core for 59% (36%). In the absence of a native structure, the structural similarity among the top five ranked models is a moderately reliable predictor of folding accuracy. If we classify the sequences according to their secondary structure content, then 64% (36%) of α, 43% (24%) of α + β, and 20% (12%) of β sequences in the 40-100 (100-200) residue range have a significant TM-score (TM-score ≥ 0.4). TASSER performs best on helical proteins because there are fewer secondary structural elements to arrange in a helical protein than in a beta protein of equal length, since the average length of a helix is longer than that of a strand. In addition, helical proteins have shorter loops and dangling tails. If we exclude these flexible fragments, then TASSER has similar accuracy for sequences containing the same number of secondary structural elements, irrespective of whether they are helices and/or strands. Thus, it is the effective configurational entropy of the protein that dictates the average likelihood of correctly arranging the secondary structure elements.
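
The two accuracy measures named in the abstract, RMSD and TM-score, can be illustrated with a minimal Python sketch. It is not taken from the paper or from TASSER. It assumes the model and native Cα coordinates are already superimposed and matched residue by residue, so it evaluates the TM-score for one fixed superposition, whereas the standard TM-score maximizes over superpositions. The distance scale d0 = 1.24 (L - 15)^(1/3) - 1.8 Å is the usual TM-score definition; the function names and the toy coordinates are hypothetical, and only the 0.4 threshold comes from the abstract.

```python
import math


def d0(length):
    """Length-dependent distance scale of the TM-score, in angstroms."""
    if length <= 15:
        raise ValueError("d0 is only meaningful for chains longer than 15 residues")
    return 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8


def pair_distances(model, native):
    """Per-residue distances between two pre-superimposed coordinate lists."""
    if len(model) != len(native):
        raise ValueError("model and native must have the same number of residues")
    return [math.dist(a, b) for a, b in zip(model, native)]


def rmsd(model, native):
    """Root mean square deviation for a fixed superposition (no refitting)."""
    d = pair_distances(model, native)
    return math.sqrt(sum(x * x for x in d) / len(d))


def tm_score_fixed(model, native):
    """TM-score for a fixed superposition (assumption: no search over rotations)."""
    d = pair_distances(model, native)
    scale = d0(len(native))
    return sum(1.0 / (1.0 + (x / scale) ** 2) for x in d) / len(native)


def is_significant(tm, threshold=0.4):
    """Apply the significance cutoff quoted in the abstract (TM-score >= 0.4)."""
    return tm >= threshold


# Toy 40-residue "native": points spaced 3.8 angstroms apart along the x axis.
native = [(3.8 * i, 0.0, 0.0) for i in range(40)]
# Hypothetical model: the same chain displaced by 3 angstroms along y.
shifted = [(x, y + 3.0, z) for x, y, z in native]

print(rmsd(shifted, native), tm_score_fixed(shifted, native))
```

Because every residue in the toy model is displaced by the same 3 Å, its RMSD is 3 Å, while its fixed-superposition TM-score falls below the 0.4 cutoff for a chain of this length. A real evaluation would first find the optimal superposition before scoring.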

Original language: English
Pages (from-to): 48-56
Number of pages: 9
Journal: Proteins: Structure, Function and Genetics
Volume: 68
Issue number: 1
State: Published - Jul 2007
Externally published: Yes

Keywords

  • Ab initio folding
  • Protein folding
  • Protein structure prediction
