Abstract
Proteomic techniques are fast becoming the main method for qualitative and quantitative determination of the protein content in biological systems. Despite notable advances, efficient and accurate analysis of high-throughput proteomic data generated by mass spectrometers remains one of the major stumbling blocks in the protein identification problem. We present a model for the number of random matches between an experimental MS-MS spectrum and a theoretical spectrum of a peptide. The shape of the probability distribution is a function of the experimental accuracy, the number of peaks in the experimental spectrum, the length of the interval over which the peaks are distributed, and the number of theoretical spectral peaks in this interval. Based on this probability distribution, a goodness-of-fit tool can be used to yield fast and accurate scoring schemes for peptide identification through database search. In this paper, we describe one possible implementation of such a method and compare the performance of the resulting scoring function with that of SEQUEST. In terms of speed, our algorithm is roughly two orders of magnitude faster than the SEQUEST program, and its accuracy of peptide identification compares favorably to that of SEQUEST. Moreover, our algorithm does not use information related to the intensities of the peaks.
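The abstract names the four quantities that shape the random-match distribution but does not state the distribution itself. As a rough illustration only, the sketch below assumes a simple binomial model: each experimental peak is placed uniformly at random over the interval and counts as a match if it lands within the mass tolerance of any theoretical peak. This binomial form, the non-overlapping-window approximation, and every function and parameter name here are assumptions made for the example, not the model or scoring function from the paper.

```python
import math


def match_probability(tolerance, n_theoretical, interval_length):
    """Chance that one uniformly placed peak falls within +/- tolerance of
    some theoretical peak (assumes the tolerance windows do not overlap)."""
    return min(1.0, 2.0 * tolerance * n_theoretical / interval_length)


def random_match_pmf(k, n_experimental, p):
    """Binomial probability of exactly k random matches (assumed model)."""
    return math.comb(n_experimental, k) * p**k * (1.0 - p) ** (n_experimental - k)


def tail_probability(k_observed, n_experimental, p):
    """Probability of at least k_observed matches arising by chance."""
    return sum(
        random_match_pmf(k, n_experimental, p)
        for k in range(k_observed, n_experimental + 1)
    )


def random_match_score(k_observed, n_experimental, p):
    """Hypothetical score: -log10 of the chance tail; larger is less random."""
    tail = tail_probability(k_observed, n_experimental, p)
    return -math.log10(max(tail, 1e-300))  # floor avoids log10(0)


# Illustrative numbers only: 0.5 Da tolerance, 20 theoretical peaks,
# 50 experimental peaks spread over a 1000 Da interval.
p = match_probability(tolerance=0.5, n_theoretical=20, interval_length=1000.0)
for k in (1, 5, 10):
    print(k, random_match_score(k, n_experimental=50, p=p))
```

Note that this sketch, like the method described in the abstract, uses only peak positions and ignores peak intensities.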
Original language | English |
---|---|
Pages (from-to) | 455-476 |
Number of pages | 22 |
Journal | Journal of Bioinformatics and Computational Biology |
Volume | 3 |
Issue number | 2 |
DOIs | |
State | Published - Apr 2005 |
Funding
The work was supported by the Office of Biological and Environmental Research, U.S. Department of Energy, under Contract DE-AC05-00OR22725, managed by UT-Battelle, LLC. Ying Xu’s work is supported, in part, by NSF fund # DBI-0213840. We thank Bob Hettich, David Tabb, Chandra Naramsinhan, Ed Uberbacher, Victor Olman, and Andrey Gorin for fruitful discussions and overall support.
Funders | Funder number |
---|---|
National Science Foundation | DBI-0213840 |
U.S. Department of Energy | DE-AC05-00OR22725 |
Biological and Environmental Research | |
UT-Battelle | |
Keywords
- Database search
- Mass spectrometry
- Tandem