Authors: YANG Xinhua (杨欣华); GU Haiming (顾海明) [1] (College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China)
Source: Journal of Qingdao University of Science and Technology: Natural Science Edition (《青岛科技大学学报(自然科学版)》), 2021, No. 6, pp. 101-110 (10 pages)
Funding: National Natural Science Foundation of China, General Program (62172248).
Abstract: This article proposes a new protein fold recognition method, the BAG-fold model. First, four methods are used to extract feature information from protein sequences: the pseudo position specific score matrix (PsePSSM), secondary structure (SS), encoding based on grouped weight (EBGW), and detrended cross-correlation analysis (DCCA). The four kinds of feature information are combined into a mixed feature space. Second, local Fisher discriminant analysis (LFDA) is applied to reduce redundant information and select the optimal feature subset. Finally, the optimal feature subset is fed into a Bagging ensemble classifier for protein fold recognition. Under 10-fold cross-validation, the accuracy reaches 96.8% on the DD dataset and 98.8% on the RDD dataset. The experimental results show that the proposed BAG-fold method markedly outperforms other prediction methods.
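The last two steps of the abstract (a Bagging ensemble evaluated with 10-fold cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation. Feature extraction and LFDA are omitted, and the abstract does not name the base learner, so the nearest-centroid learner, the synthetic two-dimensional data, and all function names below are assumptions made only for illustration.

```python
import random
from collections import Counter


def nearest_centroid_fit(X, y):
    """Base learner (an assumption, not from the paper): per-class mean vectors."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def nearest_centroid_predict(model, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(sorted(model), key=lambda c: dist2(model[c]))


def bagging_fit(X, y, n_estimators, rng):
    """Train one base learner per bootstrap sample (drawn with replacement)."""
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(nearest_centroid_fit([X[i] for i in idx],
                                           [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Majority vote over the ensemble; ties go to the smallest label."""
    votes = Counter(nearest_centroid_predict(m, row) for m in models)
    best = max(votes.values())
    return min(c for c, v in votes.items() if v == best)


def k_fold_accuracy(X, y, k=10, n_estimators=25, seed=0):
    """Overall accuracy when every sample is held out exactly once."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        models = bagging_fit([X[i] for i in train], [y[i] for i in train],
                             n_estimators, rng)
        correct += sum(bagging_predict(models, X[i]) == y[i] for i in held_out)
    return correct / len(X)


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic stand-in for a reduced feature matrix (hypothetical data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]):
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("10-fold accuracy:", k_fold_accuracy(X, y))
```

In practice, a library implementation such as scikit-learn's `BaggingClassifier` combined with stratified folds would likely replace these hand-written helpers. The sketch only makes the bootstrap-and-vote logic explicit.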