TY - JOUR
T1 - Decision tree classification of proteins identified by mass spectrometry of blood serum samples from people with and without lung cancer
AU - Markey, Mia K.
AU - Tourassi, Georgia D.
AU - Floyd, Carey E.
PY - 2003/9/1
Y1 - 2003/9/1
N2 - A classification and regression tree (CART) model was trained to classify 41 clinical specimens as disease/nondisease based on 26 variables computed from the mass-to-charge ratio (m/z) and peak heights of proteins identified by mass spectrometry. The CART model built on all of the specimens (no cross-validation) had an error rate of 4/41 = 10%. The CART model suggests that mass spectra peaks in the 8000-10 000, 20 000-30 000, 45 000-60 000, and >125 000 m/z ranges may be valuable in distinguishing between the disease/nondisease specimens. The area under the receiver operating characteristic curve was 0.80 ± 0.07 for leave-one-out cross-validation.
AB - A classification and regression tree (CART) model was trained to classify 41 clinical specimens as disease/nondisease based on 26 variables computed from the mass-to-charge ratio (m/z) and peak heights of proteins identified by mass spectrometry. The CART model built on all of the specimens (no cross-validation) had an error rate of 4/41 = 10%. The CART model suggests that mass spectra peaks in the 8000-10 000, 20 000-30 000, 45 000-60 000, and >125 000 m/z ranges may be valuable in distinguishing between the disease/nondisease specimens. The area under the receiver operating characteristic curve was 0.80 ± 0.07 for leave-one-out cross-validation.
KW - Classification
KW - Computer-aided diagnosis
KW - Decision tree, classification and regression tree
KW - Mass spectrometry
UR - http://www.scopus.com/inward/record.url?scp=0141855285&partnerID=8YFLogxK
U2 - 10.1002/pmic.200300521
DO - 10.1002/pmic.200300521
M3 - Article
C2 - 12973724
AN - SCOPUS:0141855285
SN - 1615-9853
VL - 3
SP - 1678
EP - 1679
JO - Proteomics
JF - Proteomics
IS - 9
ER -