Protein function prediction based on doubly indexed matrix  Cited by: 1


Authors: MENG Jun[1], ZHANG Xin[1]

Affiliation: [1] School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China

Source: Journal of Computer Applications (《计算机应用》), 2015, No. 6, pp. 1637-1642 (6 pages)

Funding: National Natural Science Foundation of China (61472061)

Abstract: A single data source cannot predict protein function effectively, and the information in protein-protein interaction networks is incomplete. To address these problems, a Multi-Source Integration and Random Walk with Doubly Indexed Matrix (MSI-RWDIM) algorithm is proposed. The algorithm uses protein sequence, gene expression, and protein-protein interaction data to predict protein function, and builds a weighted interaction network for each data source according to that source's characteristics. The weighted networks are then fused, and the fused network is combined with a function correlation network to construct a doubly indexed matrix. A random walk over this matrix computes annotation scores, from which protein functions are predicted. In five-fold cross-validation on a yeast dataset, MSI-RWDIM achieves higher accuracy and lower coverage, and also reduces the loss rate of function labels. The results show that the overall performance of MSI-RWDIM is better than that of the commonly used k-nearest neighbor, transductive multi-label ensemble classification, and fast simultaneous weighting methods.
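The pipeline summarized in the abstract (fuse per-source weighted networks, combine with a function correlation network, then score by random walk) can be illustrated with a minimal sketch. The abstract does not give the construction of the doubly indexed matrix or the walk's update rule, so everything below is an assumption made for illustration: the fusion is a plain weighted average, the walk is a generic random walk with restart over protein-function pairs, and the names `fuse_networks`, `random_walk_scores`, `restart`, and the toy matrices are all hypothetical. It is not the authors' MSI-RWDIM implementation.

```python
import numpy as np

def row_normalize(m):
    """Scale each row to sum to 1; all-zero rows stay zero."""
    sums = m.sum(axis=1, keepdims=True)
    return np.divide(m, sums, out=np.zeros_like(m, dtype=float), where=sums > 0)

def fuse_networks(networks, weights):
    """Assumed fusion rule: weighted average of same-shaped weight matrices."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * np.asarray(n, dtype=float) for wi, n in zip(w, networks))

def random_walk_scores(protein_net, function_net, labels,
                       restart=0.5, iters=100, tol=1e-9):
    """Generic random walk with restart over (protein, function) pairs.

    Assumed update: S <- (1 - restart) * P @ S @ F.T + restart * Y,
    where P and F are row-normalized protein and function networks
    and Y holds the known annotations.
    """
    p = row_normalize(np.asarray(protein_net, dtype=float))
    f = row_normalize(np.asarray(function_net, dtype=float))
    y = np.asarray(labels, dtype=float)
    scores = y.copy()
    for _ in range(iters):
        updated = (1 - restart) * p @ scores @ f.T + restart * y
        if np.abs(updated - scores).max() < tol:
            return updated
        scores = updated
    return scores

# Toy data (hypothetical): 3 proteins, 2 functions.
seq = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.1], [0.9, 0.1, 0.0]]   # sequence similarity
expr = [[0.0, 0.2, 0.8], [0.2, 0.0, 0.2], [0.8, 0.2, 0.0]]  # co-expression
ppi = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]   # interactions
func_corr = [[1.0, 0.2], [0.2, 1.0]]                        # function correlation
known = [[1, 0], [0, 1], [0, 0]]                            # protein 2 is unannotated

fused = fuse_networks([seq, expr, ppi], weights=[1, 1, 1])
scores = random_walk_scores(fused, func_corr, known)
predicted = int(np.argmax(scores[2]))  # highest-scoring function for protein 2
```

In this toy setup protein 2 is linked mainly to protein 0, so the walk ranks protein 0's function highest for it. Scoring every protein-function pair jointly, rather than one function at a time, is what allows correlated functions to reinforce each other.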

Keywords: multi-source data integration; random walk; doubly indexed matrix; function correlation network; protein function prediction

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
