Protein Function Prediction Using Positive and Negative Examples (cited by: 6)

Authors: Fu Guangyuan (傅广垣), Yu Guoxian (余国先)[1], Wang Jun (王峻)[1], Guo Maozu (郭茂祖)[2]

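Affiliations: [1] College of Computer and Information Science, Southwest University, Chongqing 400715; [2] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001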

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2016, No. 8, pp. 1753-1765 (13 pages)

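Funding: National Natural Science Foundation of China (61402378; 61571163; 61532014); Chongqing Basic and Frontier Research Program (cstc2014jcyjA40031; cstc2016jcyjA0351); Chongqing Graduate Student Research Innovation Project (CYS16070); Fundamental Research Funds for the Central Universities (2362015XK07; XDJK2016B009; XDJK2016D021)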

Abstract: Predicting protein function is one of the core problems of bioinformatics in the post-genomic era. Protein function annotation databases usually record only that a protein carries out a given function (positive examples) and rarely record that a protein does not carry out a given function (negative examples). Current function prediction methods rely almost exclusively on positive examples and seldom exploit the scarce but informative negative examples, although both kinds of example are needed to train a discriminative predictor. To bridge this gap, this paper proposes a protein function prediction approach using positive and negative examples (ProPN). ProPN first constructs a directed signed hybrid graph that describes the known positive and negative associations between proteins and function labels, the interactions between proteins, and the correlations between function labels; it then predicts protein functions by label propagation on this graph. Experiments on yeast, mouse, and human protein datasets show that ProPN outperforms existing algorithms in predicting negative examples for proteins whose function annotations are partially known, and also achieves higher accuracy than other related methods in predicting the functions of proteins whose annotations are completely unknown.
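
The two steps named in the abstract (build a directed signed hybrid graph, then propagate labels over it) can be illustrated with a small sketch. This is not ProPN's actual formulation, which the abstract does not give: the edge weights, the row normalization by absolute weight, the restart parameter `alpha`, and the names `build_graph` and `propagate` are all assumptions made for illustration. The sketch uses a generic propagation-with-restart scheme in which a signed protein-to-function edge passes on a positive or negative score.

```python
from collections import defaultdict


def build_graph(ppi, func_corr, pos, neg):
    """Return {source: {target: signed weight}} for a toy hybrid graph.

    Hypothetical layout (an assumption, not the paper's definition):
    protein-protein and function-function edges run in both directions,
    protein-to-function edges are directed and carry the sign.
    """
    g = defaultdict(dict)
    for a, b, w in ppi:            # protein-protein interactions
        g[a][b] = w
        g[b][a] = w
    for a, b, w in func_corr:      # function-function correlations
        g[a][b] = w
        g[b][a] = w
    for p, f in pos:               # known positive example: weight +1
        g[p][f] = 1.0
    for p, f in neg:               # known negative example: weight -1
        g[p][f] = -1.0
    return g


def propagate(g, query, alpha=0.8, iters=100):
    """Propagate a unit score from `query` along outgoing signed edges."""
    nodes = set(g) | {t for nbrs in g.values() for t in nbrs}
    score = {n: 0.0 for n in nodes}
    score[query] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for src, nbrs in g.items():
            total = sum(abs(w) for w in nbrs.values())
            if total == 0:
                continue
            for dst, w in nbrs.items():
                # Normalizing by absolute weight keeps the update a contraction
                # for alpha < 1, so the iteration converges.
                nxt[dst] += alpha * score[src] * w / total
        nxt[query] += 1.0 - alpha  # restart at the query protein
        score = nxt
    return score


# Toy data: p1 has no annotations; p2 is a positive example of f1,
# p3 is a negative example of f2, and f1 is correlated with f3.
graph = build_graph(
    ppi=[("p1", "p2", 1.0), ("p1", "p3", 1.0)],
    func_corr=[("f1", "f3", 0.5)],
    pos=[("p2", "f1")],
    neg=[("p3", "f2")],
)
scores = propagate(graph, "p1")
for f in sorted(("f1", "f2", "f3"), key=scores.get, reverse=True):
    print(f, round(scores[f], 4))
```

On this toy graph, `f1` ranks first for the unannotated protein `p1`, `f3` follows through its correlation with `f1`, and `f2` receives a negative score because the only path to it passes through a negative example.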

Keywords: protein function prediction; positive examples; negative examples; signed hybrid graph; label propagation

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
