A novel algorithm for identifying protein complex based on multiple data sources  Cited by: 2


Authors: Hu Wei (胡伟)[1], Tang Xiwei (汤希玮)[1]

Affiliation: [1] Department of Information Science and Engineering, Hunan First Normal University, Changsha, Hunan 410205, China

Source: Computers and Applied Chemistry (《计算机与应用化学》), 2014, No. 4, pp. 451-456 (6 pages)

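Funding: Hunan Provincial Natural Science Foundation, General Program (13JJ6086); Science and Technology Plan Projects of the Hunan Provincial Department of Science and Technology (2010GK3049; 2011GK3138)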

Abstract: Protein complexes are a cornerstone of many biological processes. False positives and false negatives in protein-protein interaction (PPI) data adversely affect computational methods for identifying protein complexes. To address this problem, a novel protein complex identification algorithm, ICMDS (Identifying Complexes based on Multiple Data Sources), is proposed. ICMDS mines protein complexes by integrating three kinds of biological data: gene expression profiles, essential protein information, and PPI data. First, ICMDS redefines the functional similarity (FS) between two interacting proteins. Second, ICMDS selects known essential proteins as seeds to build protein complexes. A redundancy-filtering procedure then eliminates redundant complexes. Additionally, ICMDS uses non-essential proteins as seeds and expands them into protein complexes. The experimental results show that ICMDS performs significantly better than the other computational approaches in identifying protein complexes.
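The pipeline outlined in the abstract (score interacting pairs, grow complexes from essential seeds, filter redundant complexes, then grow from non-essential seeds) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the FS formula, the expansion rule, or the redundancy criterion, so everything below is an assumption made for illustration. In particular, `functional_similarity` substitutes clipped Pearson co-expression for the paper's redefined FS, `grow` adds only direct neighbours of the seed, `overlap` uses the common |A∩B|²/(|A||B|) score, and the thresholds `fs_min` and `max_overlap` are hypothetical parameters.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def functional_similarity(u, v, expr):
    """Stand-in FS score (assumption): co-expression clipped to [0, 1]."""
    return max(0.0, pearson(expr[u], expr[v]))


def grow(seed, ppi, expr, fs_min):
    """Seed plus every direct neighbour whose FS with the seed reaches fs_min."""
    members = {v for v in ppi[seed]
               if functional_similarity(seed, v, expr) >= fs_min}
    return frozenset(members | {seed})


def overlap(a, b):
    """Overlap score |A∩B|^2 / (|A||B|), assumed as the redundancy measure."""
    return len(a & b) ** 2 / (len(a) * len(b))


def filter_redundant(candidates, max_overlap):
    """Keep larger complexes first; drop any that overlap a kept one too much."""
    kept = []
    for c in sorted(candidates, key=len, reverse=True):
        if all(overlap(c, k) < max_overlap for k in kept):
            kept.append(c)
    return kept


def identify_complexes(ppi, expr, essential, fs_min=0.5, max_overlap=0.8):
    """ppi: dict protein -> set of neighbours (symmetric); expr: dict protein -> profile."""
    def candidates(seeds):
        grown = (grow(s, ppi, expr, fs_min) for s in seeds)
        return [c for c in grown if len(c) >= 2]  # ignore singletons

    # Phase 1: known essential proteins act as seeds.
    first = filter_redundant(
        candidates(sorted(s for s in essential if s in ppi)), max_overlap)
    covered = set().union(*first)

    # Phase 2: non-essential proteins not yet in any complex act as seeds.
    rest = candidates(s for s in sorted(ppi)
                      if s not in essential and s not in covered)
    return filter_redundant(first + rest, max_overlap)
```

Calling `identify_complexes(ppi, expr, essential)` with a symmetric adjacency dictionary, per-protein expression profiles, and a set of essential protein names returns a list of `frozenset` complexes. A faithful reimplementation would replace the stand-in scoring and expansion rules with the definitions given in the full paper.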

Keywords: protein-protein interaction network; essential proteins; gene expression profiles; protein complex identification algorithm

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
