Application of the Random Forest Algorithm in β-Hairpin Motif Prediction

Author: Jia Shaochun (贾少春) [1]

Affiliation: [1] Xinzhou Teachers University, Xinzhou, Shanxi 034000, China

Source: Journal of Xinzhou Teachers University (《忻州师范学院学报》), 2015, No. 5, pp. 6-9, 28 (5 pages)

Abstract: To explore new approaches to β-hairpin motif prediction, this paper applies the random forest algorithm, using the increment of diversity, position weight matrix scores, and predicted secondary structure information as feature parameters. Predictions were made for β-hairpin motifs with loop lengths of 2 to 8 amino acid residues in the ArchDB40 database. The dataset was divided evenly into five parts, with one part used as the training set and the other four as the test set. In the independent test, the overall prediction accuracy was 79.4% and the Matthews correlation coefficient was 0.48. In addition, for β-hairpin motifs in the ArchDB40 database, with the same feature parameters and the same testing method, the random forest algorithm performed better than the support vector machine (SVM).
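The evaluation protocol described in the abstract (one fifth of the data for training, four fifths for testing, scored by overall accuracy and the Matthews correlation coefficient) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's code: the function names `split_one_fifth_train` and `accuracy_and_mcc` are hypothetical, the shuffling seed is arbitrary, and any counts passed in are made-up examples rather than results from the paper.

```python
import math
import random


def split_one_fifth_train(samples, seed=0):
    """Shuffle the samples; the first fifth trains, the rest tests.

    Hypothetical helper: the abstract only states a 1:4 split, not
    how samples were assigned to each part.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    fold = len(items) // 5
    return items[:fold], items[fold:]


def accuracy_and_mcc(tp, tn, fp, fn):
    """Overall accuracy and Matthews correlation coefficient
    computed from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report 0.0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc
```

The Matthews correlation coefficient takes false positives and false negatives into account together, which may be why the abstract reports it alongside overall accuracy.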

Keywords: random forest algorithm; increment of diversity; matrix scoring function; β-hairpin motif

Classification No.: O24 [Science: Computational Mathematics]

 
