Authors: MOU Changning (牟长宁); WANG Haipeng (王海鹏)[1]; ZHOU Piyu (周丕宇); HOU Xinhang (侯鑫行) (School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China)
Affiliation: [1] School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China
Source: Journal of Computer Applications (《计算机应用》), 2021, No. 9, pp. 2773-2779 (7 pages)
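Funding: National Natural Science Foundation of China (31500669); Natural Science Foundation of Shandong Province (ZR2014FQ024); Support Program for Outstanding Young Innovation Teams in Higher Education Institutions of Shandong Province (2019KJN048).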
Abstract: In proteomics, de novo sequencing is one of the important methods for peptide sequencing by tandem mass spectrometry. Because it does not depend on a protein database, it plays a key role in determining protein sequences of unknown species, in monoclonal antibody sequencing, and in related fields. However, owing to its complexity, de novo sequencing is far less accurate than database search methods, which limits its wider adoption. To address this low accuracy, the paper proposes denovo-GCN, a de novo sequencing method based on a graph convolutional neural network (GCN). The method represents the relationships between peaks in a mass spectrum as a graph and extracts peak features at each corresponding peptide cleavage site. A GCN then predicts the amino acid type at the current cleavage site, and the complete peptide sequence is assembled step by step. Three parameters that strongly affect the model were determined experimentally: the number of GCN layers, the combination of ion types, and the number of spectral peaks used for sequencing. Datasets from multiple species were used for experimental comparison. The results show that the peptide-level recall of denovo-GCN is 4.0 to 21.1 percentage points higher than that of the graph theory-based de novo sequencing methods Novor and pNovo, and 2.1 to 10.7 percentage points higher than that of DeepNovo, which is based on a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network.
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
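The pipeline summarized in the abstract (peaks as a graph, a GCN over that graph, one amino acid prediction per cleavage site) can be illustrated with a rough sketch. Everything below is an assumption made for illustration and not the denovo-GCN implementation: the edge rule (two peaks are linked when their m/z gap matches a residue mass), the two-column feature layout, the mean pooling, the four-residue alphabet, and the names `build_peak_graph`, `gcn_layer` and `predict_residue_probs` are all hypothetical. The layer itself uses the widely known propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W).

```python
import numpy as np

# Monoisotopic residue masses (Da) for four amino acids only, to keep the
# sketch short; a real model would cover the full alphabet.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841}


def build_peak_graph(mz, tol=0.02):
    """Adjacency matrix over peaks (assumed edge rule, not from the paper):
    peaks i and j are linked when |mz_i - mz_j| matches a residue mass."""
    mz = np.asarray(mz, dtype=float)
    gap = np.abs(mz[:, None] - mz[None, :])
    adj = np.zeros_like(gap)
    for mass in RESIDUE_MASS.values():
        adj[np.abs(gap - mass) <= tol] = 1.0
    return adj


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degrees are >= 1
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)


def predict_residue_probs(mz, intensity, w1, w2):
    """Score the next residue for one cleavage site (hypothetical head):
    GCN layer -> mean pooling over peaks -> linear layer -> softmax."""
    adj = build_peak_graph(mz)
    # Assumed per-peak features: scaled m/z and relative intensity.
    feats = np.stack([np.asarray(mz, dtype=float) / 1000.0,
                      np.asarray(intensity, dtype=float)], axis=1)
    hidden = gcn_layer(adj, feats, w1)
    logits = hidden.mean(axis=0) @ w2
    exp = np.exp(logits - logits.max())            # numerically stable softmax
    return exp / exp.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mz = [100.0, 157.02146, 228.05857, 400.0]
    intensity = [0.4, 1.0, 0.7, 0.1]
    w1 = rng.normal(size=(2, 8))                   # untrained, random weights
    w2 = rng.normal(size=(8, len(RESIDUE_MASS)))
    probs = predict_residue_probs(mz, intensity, w1, w2)
    print(dict(zip(RESIDUE_MASS, probs.round(3))))
```

With untrained random weights the printed probabilities carry no meaning; the sketch only shows how a peak graph and a graph-convolution step fit together. Sequencing "step by step", as the abstract puts it, would repeat such a prediction at each successive cleavage site and append the chosen residue to the growing peptide.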