Prediction of protein-protein interactions based on deep neural networks

Original title: 基于深度神经网络的蛋白质相互作用预测框架 (A deep neural network-based framework for predicting protein interactions)

Cited by: 6


Authors: LIU Gui-xia [1,2]; WANG Mo-yuan; SU Ling-tao; WU Chun-guo [1,2]; SUN Li-yan; WANG Rong-quan

Affiliations: [1] College of Computer Science and Technology, Jilin University, Changchun 130012, China [2] Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, Jilin University, Changchun 130012, China

Source: Journal of Jilin University (Engineering and Technology Edition), 2019, No. 2, pp. 570-577 (8 pages)

Funding: National Natural Science Foundation of China (61772226; 61373051; 61502343); Science and Technology Development Program of Jilin Province (20140204004GX)

Abstract: Experimental methods for detecting protein-protein interactions (PPIs) suffer from high false-positive and false-negative rates. To address this, a PPI prediction framework based on a deep neural network (DNN) is proposed that integrates several kinds of protein feature data. GO term semantic similarity, sequence similarity, protein essentiality, and subcellular localization information are extracted from diverse databases and combined into a fixed-length, low-dimensional feature vector that serves as the classifier input. A data-driven DNN is then built to learn from this input automatically and to predict whether a previously unseen protein pair interacts. Dropout is applied during training to reduce complex co-adaptation among neurons, which improves overall performance. On the S. cerevisiae protein dataset the framework achieves an accuracy of 95.67% and a precision of 96.38%. The experimental results show that the extracted features are well suited to PPI prediction, that many commonly used machine learning models can predict interactions effectively and efficiently from this feature vector, and that the DNN-based framework generalizes well, performing strongly on a variety of feature data.
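The pipeline summarized in the abstract (a feature vector per protein pair, a feed-forward network with dropout, then accuracy and precision on the predictions) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the four-element input, the single hidden layer of three units, the weight values in `PARAMS`, the 0.5 dropout rate, and every function name are assumptions made for the example, since the record does not give the network architecture or the feature encoding.

```python
import math
import random


def dense(x, weights, biases):
    """Affine layer; `weights` holds one row of input weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dropout(v, rate, rng, training):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and rescale survivors by 1 / (1 - rate); identity at inference."""
    if not training or rate == 0.0:
        return list(v)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in v]


def predict_proba(x, params, rate=0.5, rng=None, training=False):
    """Forward pass: dense -> ReLU -> dropout -> dense -> sigmoid."""
    rng = rng or random.Random(0)
    hidden = relu(dense(x, params["W1"], params["b1"]))
    hidden = dropout(hidden, rate, rng, training)
    return sigmoid(dense(hidden, params["W2"], params["b2"])[0])


def accuracy_precision(y_true, y_pred):
    """Accuracy = (TP + TN) / N; precision = TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


# Hypothetical, untrained weights: 4 inputs -> 3 hidden units -> 1 output.
PARAMS = {
    "W1": [[0.5, -0.2, 0.1, 0.3],
           [-0.4, 0.6, 0.2, -0.1],
           [0.3, 0.3, -0.5, 0.2]],
    "b1": [0.0, 0.1, -0.1],
    "W2": [[0.7, -0.6, 0.4]],
    "b2": [0.05],
}

# Assumed feature order: GO semantic similarity, sequence similarity,
# essentiality, subcellular co-localization (all scaled to [0, 1] here).
pair_features = [0.82, 0.35, 1.0, 1.0]
probability = predict_proba(pair_features, PARAMS)  # inference: dropout off
label = 1 if probability >= 0.5 else 0
```

Rescaling the surviving units during training keeps the expected activation unchanged, so the same weights can be used at inference with dropout switched off. In practice the weights would be learned by backpropagation with a deep learning library rather than set by hand.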

Keywords: artificial intelligence; protein-protein interaction; protein features; protein sequence; deep neural network

Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
