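Affiliations: [1] College of Computer and Information Science, Southwest University, Chongqing 400715; [2] School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044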
Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), 2017, No. 10, pp. 1349-1368 (20 pages)
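Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61402378, 61571163, 61532014, 61671189); the Chongqing Graduate Student Research and Innovation Project (Grant No. CYS16070); the Chongqing Basic and Frontier Research Program (Grant Nos. cstc2014jcyjA40031, cstc2016jcyjA0351); and the Fundamental Research Funds for the Central Universities (Grant Nos. 2362015XK07, XDJK2016B009, XDJK2017D061).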
Abstract: Proteins are the material foundation of many life processes, and accurately annotating their biological functions can significantly advance research in the life sciences. Existing function prediction methods usually exploit only the knowledge that a protein performs specific functions (positive examples) and ignore the knowledge that some functions are irrelevant to a protein (negative examples). Previous research indicates that incorporating negative examples can reduce the complexity and improve the accuracy of protein function prediction. This paper proposes an approach for predicting irrelevant functions of proteins based on dimensionality reduction (IFDR). IFDR first performs random walks separately on the adjacency matrix of a protein-protein interaction (PPI) network and on the protein-function association matrix, in order to explore the underlying relationships between proteins and to estimate the missing functional annotations of proteins. Next, IFDR uses singular value decomposition to project each of these two matrices into a low-dimensional real-valued matrix. Finally, IFDR uses semi-supervised regression to predict negative examples. Experiments on S. cerevisiae (yeast), H. sapiens (human), and A. thaliana protein datasets show that IFDR predicts negative examples more accurately than existing related methods, and that dimensionality reduction in both the interaction network space and the function label space improves the accuracy of negative example prediction.
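The three stages named in the abstract (random walk, SVD-based dimensionality reduction, semi-supervised regression) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so every function name, parameter (`steps`, `alpha`, `gamma`, `ridge`, `k_net`, `k_lab`), and the toy data below are assumptions. In particular, one step of label propagation over the smoothed network stands in for the random walk on the protein-function matrix, and Laplacian-regularized ridge regression stands in for the paper's semi-supervised regression.

```python
import numpy as np

def random_walk_smooth(W, steps=2):
    """Average the first `steps` powers of the row-normalized transition matrix of W."""
    row_sums = np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # guard isolated nodes
    P = W / row_sums
    Pk = np.eye(W.shape[0])
    acc = np.zeros_like(P, dtype=float)
    for _ in range(steps):
        Pk = Pk @ P
        acc += Pk
    return acc / steps

def svd_embed(M, k):
    """Rank-k SVD: return row embeddings (U_k * s_k) and the basis V_k^T."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k]

def irrelevance_scores(W, Y, k_net=3, k_lab=2, steps=2,
                       alpha=0.5, gamma=0.1, ridge=1e-2):
    """Return an (n_proteins x n_functions) score matrix; low scores suggest irrelevance."""
    # Stage 1: smooth the network, then spread annotations over it (assumed stand-in
    # for the paper's random walk on the protein-function association matrix).
    S = random_walk_smooth(W, steps)
    Y_hat = alpha * Y + (1 - alpha) * (S @ Y)
    # Stage 2: reduce both matrices with truncated SVD.
    X, _ = svd_embed(S, k_net)        # proteins in network space
    Z, Vt = svd_embed(Y_hat, k_lab)   # proteins in compressed label space
    # Stage 3: ridge regression X -> Z with a graph-Laplacian smoothness penalty
    # (assumed form of the semi-supervised regression).
    S_sym = (S + S.T) / 2
    L = np.diag(S_sym.sum(axis=1)) - S_sym
    A = X.T @ X + ridge * np.eye(k_net) + gamma * (X.T @ L @ X)
    B = np.linalg.solve(A, X.T @ Z)
    return X @ B @ Vt                 # map back to the original label space

def pick_negatives(scores, Y, n_neg=1):
    """For each protein, pick the n_neg lowest-scored functions it is not annotated with."""
    masked = np.where(Y > 0, np.inf, scores)
    return np.argsort(masked, axis=1)[:, :n_neg]

# Toy data (hypothetical): 6 proteins in two loosely linked clusters, 4 functions.
W = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
Y = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 0],
              [0, 0, 1, 1]], dtype=float)

scores = irrelevance_scores(W, Y)
negatives = pick_negatives(scores, Y, n_neg=1)
```

Here `negatives[i]` holds the index of the function judged least relevant to protein `i` among those it is not already annotated with. Masking annotated functions with infinity before sorting keeps known positives from being selected, provided each protein has at least `n_neg` unannotated functions.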