Authors: FANG Chun (方春), TIAN Aikui (田爱奎) [1], SUN Fuzhen (孙福振) [1], LI Caihong (李彩虹) [1], ZHU Daming (朱大铭) [2]
Affiliations: [1] School of Computer Science and Technology, Shandong University of Technology, Zibo 255049, Shandong, China; [2] Shandong Provincial Key Laboratory of Software Engineering, Shandong University, Jinan 250000, Shandong, China
Source: Journal of University of Jinan (Science and Technology) (《济南大学学报(自然科学版)》), 2018, No. 4, pp. 280-285 (6 pages)
Funding: National Natural Science Foundation of China (61602280; 61473179); Natural Science Foundation of Shandong Province (ZR2014FQ028)
Abstract: Experimental identification of molecular recognition features (MoRFs) in intrinsically disordered proteins is time-consuming, laborious, and difficult, while traditional computer-aided identification methods depend heavily on hand-picked features and have low accuracy. To address these problems, a method based on a deep convolutional neural network is proposed for predicting MoRF positions. The method takes the protein sequence directly as input and maps it to a numerical matrix by computing the corresponding position-specific scoring matrix (PSSM) and three groups of amino acid index features; the model then extracts features on its own and automatically identifies the implicit sequence patterns of MoRFs to make its predictions. The results show that, when the same datasets are used for training and testing, the proposed method clearly outperforms other traditional identification algorithms, reaching an area under the receiver operating characteristic curve (AUC) of 0.708 on the validation set and 0.760 on the test set. This indicates that a deep convolutional neural network can effectively identify the implicit sequence patterns of MoRFs. The method can also be used to identify other aggregated functional sites in proteins.
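The sequence-to-matrix mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_feature_matrix` is hypothetical, the PSSM is assumed to be precomputed and passed in as one 20-value row per residue, and the Kyte-Doolittle hydropathy scale is used only as a stand-in, since the abstract does not name the three amino acid index groups.

```python
from typing import Dict, List, Sequence

# Kyte-Doolittle hydropathy scale. ASSUMPTION: a stand-in index only; the
# abstract does not say which three amino acid index groups were used.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

PSSM_WIDTH = 20  # one score per standard amino acid


def build_feature_matrix(
    sequence: str,
    pssm: Sequence[Sequence[float]],
    indexes: Sequence[Dict[str, float]],
) -> List[List[float]]:
    """Return one feature row per residue: its PSSM row followed by one
    value from each amino acid index (hypothetical helper)."""
    if len(pssm) != len(sequence):
        raise ValueError("PSSM must have exactly one row per residue")
    matrix: List[List[float]] = []
    for residue, pssm_row in zip(sequence, pssm):
        if len(pssm_row) != PSSM_WIDTH:
            raise ValueError("each PSSM row must hold 20 scores")
        index_values = [index[residue] for index in indexes]
        matrix.append(list(pssm_row) + index_values)
    return matrix


# Toy usage: an all-zero placeholder PSSM and a single index scale.
toy_sequence = "MKV"
toy_pssm = [[0.0] * PSSM_WIDTH for _ in toy_sequence]
toy_matrix = build_feature_matrix(toy_sequence, toy_pssm, [KYTE_DOOLITTLE])
```

With three index scales instead of one, each row would hold 23 values. A matrix of this kind is what a convolutional model would consume; the network architecture itself is not described in the abstract and is not sketched here.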
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]