Authors: Zhang Weixun (张维洵), Pan Xiaoyong (潘小勇), Shen Hongbin (沈红斌) [1]
Affiliation: [1] Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
Source: Journal of Nanjing University of Science and Technology (《南京理工大学学报》), 2020, No. 3, pp. 278-287 (10 pages)
Funding: National Natural Science Foundation of China (61725302, 61671288, 61903248).
Abstract: To improve the prediction accuracy of protein signal peptides and their cleavage sites, and to distinguish effectively among three types of signal peptides, a deep learning method is proposed that is based on the position-specific scoring matrix (PSSM) and hidden Markov model (HMM) profiles obtained by iterative homology detection. A neural network based on the self-attention mechanism is designed for signal peptide prediction, and a model ensemble based on knowledge transfer is used to improve prediction performance. A conditional random field (CRF) built on a gated recurrent unit (GRU) network is designed to predict signal peptide cleavage sites, and a domain-rule-based method is integrated to improve prediction ability. Experimental results show that, for Sec/SPI, Sec/SPII, and Tat/SPI signal peptide prediction in Gram-negative and Gram-positive bacteria, the method achieves an average Matthews correlation coefficient (MCC) of 0.962. For cleavage-site prediction on the same three signal peptide types in Gram-negative and Gram-positive bacteria, the average recall and accuracy are 0.698 and 0.662, respectively. On some signal peptide samples, the method correctly predicts cases that SignalP 5.0 predicts incorrectly, indicating that the two methods are to some extent complementary in cleavage-site prediction.
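For readers unfamiliar with the MCC figure quoted in the abstract, the following is a minimal sketch of how a binary Matthews correlation coefficient can be computed from confusion-matrix counts. The function name and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's averaging over organism groups and signal peptide types is not reproduced here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary MCC from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention
    when one row or column of the confusion matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only (not results from the paper).
example_mcc = matthews_corrcoef(tp=90, tn=95, fp=5, fn=10)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it remains informative when the positive and negative classes are imbalanced, which may be why it is often preferred for signal peptide detection.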
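Keywords: deep learning; domain rules; protein; signal peptide; knowledge transfer; gated recurrent unit; conditional random field
Classification No.: TP391.4 [Automation and Computer Technology - Computer Application Technology]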