A SELDI-TOF-MS Data Classification Method for Prostate Based on Probabilistic Principal Components Analysis and Support Vector Machine[J]. 2016,35(1):91-95.
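Based on surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) data acquired in prostate cancer detection, a classification method combining probabilistic principal components analysis (PPCA) with a support vector machine (SVM) is proposed. Features were extracted from the mass spectra of 322 clinical serum samples; a randomly selected training set (225 samples) was used to build the SVM discriminant model, and the remaining samples (97) were used for testing. Using root mean square error, recognition rate, and predictive rate as metrics, the PPCA-SVM model was compared with partial least squares (PLS) and PCA-SVM models. The recognition and predictive rates were 90.92% and 76.38% for PLS, 99.23% and 84.63% for PCA-SVM, and 99.01% and 90.41% for PPCA-SVM. SELDI-TOF-MS combined with PPCA-SVM is therefore accurate and repeatable for sample classification and provides a new method for the early diagnosis of prostate cancer.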
Abstract
A method combining probabilistic principal components analysis (PPCA) with a support vector machine (SVM) is presented for analyzing SELDI-TOF-MS data generated in a clinical proteomic study. PPCA was used for feature extraction on 322 mass spectra; 225 spectra were randomly selected as the learning set to establish the SVM model, and the remaining 97 served as the testing set for prediction and verification. Root mean square error, recognition rate, and predictive rate were used to evaluate the model's classification performance. To verify that performance further, the proposed model was compared with a partial least squares (PLS) model and a PCA-SVM model. The recognition rates for PLS, PCA-SVM, and PPCA-SVM were 90.92%, 99.23%, and 99.01%, respectively, and the predictive rates were 76.38%, 84.63%, and 90.41%, respectively. These results indicate that the proposed PPCA-SVM model is an accurate and repeatable method for automatically detecting prostate cancer and provides a new approach for its early clinical diagnosis.
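The workflow summarized above (PPCA feature extraction, a random 225/97 split, an SVM classifier, and three evaluation metrics) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for the 322 spectra, the number of retained components, the RBF kernel, and the SVM parameters are arbitrary choices, PPCA is fitted on the training spectra only, and the recognition and predictive rates are interpreted here as accuracy on the training and testing sets. The helper names `fit_ppca` and `ppca_transform` are hypothetical. PPCA uses the closed-form maximum-likelihood solution of Tipping and Bishop.

```python
import numpy as np
from sklearn.svm import SVC


def fit_ppca(X, n_components):
    """Closed-form maximum-likelihood PPCA fit (Tipping and Bishop)."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # The SVD of the centered data yields the eigen-decomposition
    # of the sample covariance matrix (eigenvalues = s**2 / N).
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s ** 2 / X.shape[0]
    # Noise variance: average of the discarded eigenvalues.
    sigma2 = eigvals[n_components:].sum() / (X.shape[1] - n_components)
    # Loading matrix W = U_q (Lambda_q - sigma2 * I)^(1/2).
    scale = np.sqrt(np.maximum(eigvals[:n_components] - sigma2, 0.0))
    W = Vt[:n_components].T * scale
    return mu, W, sigma2


def ppca_transform(X, mu, W, sigma2):
    """Posterior mean of the latent variables, M^-1 W^T (x - mu)."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (X - mu).T).T


# Synthetic stand-in for the spectra (assumption: 50 features per sample,
# with two classes that differ in the first five features).
rng = np.random.default_rng(0)
n_samples, n_features, n_components = 322, 50, 5
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, n_features))
X[:, :5] += 3.0 * y[:, None]

# Random split into 225 training and 97 testing samples.
perm = rng.permutation(n_samples)
train_idx, test_idx = perm[:225], perm[225:]

# Fit PPCA on the training set only, then project both sets.
mu, W, sigma2 = fit_ppca(X[train_idx], n_components)
Z_train = ppca_transform(X[train_idx], mu, W, sigma2)
Z_test = ppca_transform(X[test_idx], mu, W, sigma2)

# SVM classifier on the PPCA features (kernel and parameters are assumptions).
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(Z_train, y[train_idx])

pred = clf.predict(Z_test)
recognition_rate = clf.score(Z_train, y[train_idx])  # training-set accuracy
predictive_rate = float(np.mean(pred == y[test_idx]))  # testing-set accuracy
rmse = float(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))

print(f"recognition rate: {recognition_rate:.4f}")
print(f"predictive rate:  {predictive_rate:.4f}")
print(f"RMSE:             {rmse:.4f}")
```

Fitting PPCA on the training spectra alone keeps the testing set unseen during feature extraction. Fitting on all 322 spectra before splitting would be a different design, and the figures printed for this synthetic data are not comparable to the rates reported in the abstract.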
Keywords
prostate cancer; probabilistic principal components analysis; support vector machine; SELDI-TOF-MS