Protein Folding Recognition Based on Bagging Ensemble Learning

Authors: YANG Xinhua (杨欣华), GU Haiming (顾海明) [1]

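Affiliation: [1] College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China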

Source: Journal of Qingdao University of Science and Technology: Natural Science Edition (《青岛科技大学学报（自然科学版）》), 2021, Issue 6, pp. 101-110 (10 pages)

Funding: National Natural Science Foundation of China, General Program (62172248).

Abstract: This article proposes a new protein fold recognition method, the BAG-fold model. First, feature information is extracted from protein sequences with four methods: the pseudo position specific score matrix (PsePSSM), secondary structure (SS), encoding based on grouped weight (EBGW), and detrended cross-correlation analysis (DCCA). The four kinds of feature information are combined into a mixed feature space. Second, local Fisher discriminant analysis (LFDA) is applied to reduce redundant information and select the optimal feature subset. Finally, the optimal feature subset is fed into a Bagging ensemble classifier for protein fold recognition. Under 10-fold cross-validation, accuracy reached 96.8% on the DD dataset and 98.8% on the RDD dataset. The experimental results show that the proposed BAG-fold method clearly outperforms the other prediction methods it was compared with.
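The final stage described in the abstract, a Bagging ensemble evaluated with 10-fold cross-validation, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the base learner, so a nearest-centroid classifier is assumed here; the data are synthetic stand-ins for the fused feature vectors; and the PsePSSM/SS/EBGW/DCCA feature extraction and the LFDA step are omitted. All function names are hypothetical.

```python
import random
from collections import Counter

def fit_centroids(X, y):
    """Assumed base learner: per-class mean vector (nearest-centroid)."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def predict_centroids(model, row):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(model[c], row)))

def bagging_fit(X, y, n_estimators=15, rng=None):
    """Train each base learner on a bootstrap sample drawn with replacement."""
    rng = rng or random.Random(0)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Aggregate the base learners' predictions by majority vote."""
    votes = Counter(predict_centroids(m, row) for m in models)
    return votes.most_common(1)[0][0]

def cross_val_accuracy(X, y, k=10, seed=0):
    """Plain (unstratified) k-fold cross-validation accuracy."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        models = bagging_fit([X[i] for i in train], [y[i] for i in train], rng=rng)
        correct += sum(bagging_predict(models, X[i]) == y[i] for i in fold)
    return correct / len(X)

def make_toy_data(n_per_class=40, seed=1):
    """Synthetic stand-in for fused feature vectors: 3 well-separated classes."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0, 0.0, 0.0), 1: (5.0, 5.0, 0.0, 0.0), 2: (0.0, 5.0, 5.0, 5.0)}
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0, 1) for c in center])
            y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"10-fold CV accuracy on toy data: {cross_val_accuracy(X, y):.3f}")
```

Each base learner sees a different bootstrap resample of the training folds, and the vote across learners is what reduces variance relative to a single classifier. Accuracy on this toy data says nothing about the DD or RDD results reported above.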

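Keywords: protein folding; multi-information fusion; detrended cross-correlation analysis; local Fisher discriminant analysis; Bagging ensemble learning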

Classification number: Q811.4 [Biology: Bioengineering]
