Authors: QIN Qi-qi; DING Xue-ming[1]; WANG Jin-lei
Affiliation: [1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2023, No. 12, pp. 2692-2699 (8 pages)
Funding: Supported by the National Natural Science Foundation of China (11502145).
Abstract: Accurate prediction of protein function benefits biomedical research. The rapid development of high-throughput sequencing has accelerated the extraction of protein sequences, producing a large number of unannotated proteins, and newly sequenced proteins lack structural and other biological information. To address this problem, this paper proposes a protein function prediction model based on sequences and combined graph convolutional networks (Protein Function Prediction using Sequences and Combined Graph Convolutional Networks, PFP-SCGCN). First, the model captures multi-dimensional feature information from the protein sequence with a deep learning method, and extracts evolutionary coupling information and amino acid residue communities from the sequence through multiple sequence alignment. Next, the evolutionary coupling information and the residue communities are used to generate two adjacency matrices that encode different degrees of connection between the amino acids of the sequence. The two adjacency matrices and the sequence feature information are then fed into the combined graph convolutional network for information fusion. Finally, protein function class information is obtained through multiple fully connected layers. The paper also identifies protein functional sites by analyzing a specific network layer of PFP-SCGCN, which can help infer the important amino acids in a new sequence. The results show that the function prediction accuracy of PFP-SCGCN is much higher than that of the comparison methods, that the model is robust, and that it can identify functional sites fairly accurately.
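Keywords: protein function prediction; functional sites; graph convolutional network; protein sequence
CLC number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]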
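Illustrative sketch: the abstract describes feeding per-residue sequence features and two adjacency matrices into a combined graph convolutional network, followed by fully connected layers. The record does not give the layer definitions, so the following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes a standard symmetric-normalized graph convolution per adjacency matrix, fusion by concatenating the two branch outputs, mean pooling over residues, and a single sigmoid output layer. All function and parameter names here are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Symmetrically normalize A + I (a common GCN choice, assumed here)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[a_hat[i][j] * d_inv_sqrt[i] * d_inv_sqrt[j] for j in range(n)] for i in range(n)]

def gcn_branch(x, adj, w):
    """One graph convolution: normalized adjacency @ features @ weights."""
    return matmul(matmul(normalize_adjacency(adj), x), w)

def combined_gcn_layer(x, adj_coupling, adj_community, w_coupling, w_community):
    """Run one branch per adjacency matrix, concatenate per residue, apply ReLU.

    Concatenation is an assumed fusion rule; the source only says the two
    matrices and the features are fused by the combined network.
    """
    h1 = gcn_branch(x, adj_coupling, w_coupling)
    h2 = gcn_branch(x, adj_community, w_community)
    return [[max(v, 0.0) for v in r1 + r2] for r1, r2 in zip(h1, h2)]

def predict_function_scores(x, adj_coupling, adj_community,
                            w_coupling, w_community, w_out, b_out):
    """Mean-pool residue embeddings, then one dense layer with sigmoid outputs."""
    h = combined_gcn_layer(x, adj_coupling, adj_community, w_coupling, w_community)
    pooled = [sum(col) / len(h) for col in zip(*h)]
    logits = [sum(p * w for p, w in zip(pooled, col)) + b
              for col, b in zip(zip(*w_out), b_out)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]
```

Here `x` holds one feature row per residue, `adj_coupling` and `adj_community` are square matrices over the same residues, and the output is one score per function class. A multi-label sigmoid output is itself an assumption; the source only states that function class information is obtained.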