Prediction of Protein O-Glycosylation Sites by Kernel Principal Component Analysis and Support Vector Machines  Cited by: 3


Authors: Yang Xuemei (杨雪梅) [1], Su Zhen (苏祯) [2]

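Affiliations: [1] School of Mathematics and Information Science, Xianyang Normal University, Xianyang 712000; [2] International Business School, Southwestern University of Finance and Economics, Chengdu 610074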

Source: Science Technology and Engineering (《科学技术与工程》), 2013, No. 25, pp. 7371-7376 (6 pages)

Funding: Supported by the 2013 Scientific Research Program of the Shaanxi Provincial Department of Education (2013JK1125)

Abstract: To improve the prediction accuracy of protein O-glycosylation sites, a method combining kernel principal component analysis (KPCA) with support vector machines (SVM) is proposed. The experimental samples were encoded with sparse coding using a window size of w = 21. First, the kernel principal components (features) of the samples were extracted by KPCA; then classification (prediction) was carried out in the feature space with an improved support vector machine (ISVM). For the ISVM, a bound coefficient αc was defined to reduce the computational complexity of the model. The experimental results show that KPCA + ISVM performs better than PCA + SVM and SVM alone, with a prediction accuracy of about 87%. Furthermore, the same protein sequences were examined under various window sizes (w = 5, 7, 9, 11, 21, 31, 41, 51), and a majority-vote scheme was used to combine the sub-classifiers. The results indicate that the ensemble of KPCA + ISVM classifiers outperforms the individual sub-classifiers, with a prediction accuracy of about 88%.
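The sparse coding and majority-vote steps named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 20-letter amino acid alphabet, the `-` padding at sequence ends, the all-zero encoding of padding, and the tie-breaking rule are all assumptions, since the record does not specify them. The function names are hypothetical, and the KPCA and ISVM stages are not shown.

```python
from collections import Counter

# Assumption: the 20 standard amino acids; the record does not give the alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_window(sequence, center, w):
    """Return the length-w window centered on index `center`.

    Positions that fall outside the sequence are padded with '-'
    (an assumed convention).
    """
    half = w // 2
    return "".join(
        sequence[i] if 0 <= i < len(sequence) else "-"
        for i in range(center - half, center + half + 1)
    )


def sparse_encode(window):
    """Encode each residue as 20 bits with a single 1 (one-hot).

    Padding or unknown residues become 20 zeros (an assumed convention).
    """
    vec = []
    for residue in window:
        vec.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vec


def majority_vote(labels):
    """Combine sub-classifier labels (+1 = site, -1 = non-site).

    Ties resolve to -1 here; this rule is an assumption.
    """
    counts = Counter(labels)
    return 1 if counts[1] > counts[-1] else -1


# Hypothetical toy sequence; index 3 is a serine (S) candidate site.
window = extract_window("MKTSAYS", center=3, w=21)
features = sparse_encode(window)      # 21 residues x 20 bits = 420 values
decision = majority_vote([1, 1, -1])  # three hypothetical sub-classifier outputs
```

Under these assumptions a window of w = 21 yields a 420-dimensional binary vector, which would be the input to the feature-extraction stage. With eight window sizes, as in the abstract, a 4-4 split is possible, so any real ensemble needs an explicit tie rule.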

Keywords: protein prediction; kernel principal component analysis; improved support vector machine; ensemble classifier

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
