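Affiliation: [1] College of Computer Science and Technology, Jilin University, Changchun 130012
Source: Journal of Nanjing University (Natural Science), 2012, No. 1, pp. 63-69 (7 pages)
Funding: Jilin Province Science and Technology Development Project (20090501)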
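Abstract: In recent years, multi-source data fusion has become a hot topic in protein function prediction. This paper proposes a multi-source data fusion method based on the Choquet fuzzy integral and applies it to predicting the functions of yeast proteins. Support vector machines serve as base classifiers that make predictions on each data source and output their results as probabilities. A particle swarm algorithm determines the fuzzy densities, and the Choquet fuzzy integral then fuses the results from the individual data sources. Experiments show that protein function prediction with the Choquet fuzzy integral is clearly better than the traditional weighted average method, the support vector machine method, and the K-nearest-neighbor method.
Predicting protein function is one of the main problems of the post-genomic period, and the availability of large amounts of biological data makes it achievable. In many cases, however, biological data obtained through biotechnology are highly noisy, and a single data source generally provides useful information for only a subset of the protein function classes. Data fusion, which uses diverse biological data to predict protein function, has therefore attracted broad interest in recent years. Compared with the common information fusion method of weighted averaging, a fuzzy measure can reflect not only the importance of different objects but also the interactions among them. In this paper, Choquet fuzzy integral fusion based on a fuzzy measure is therefore used to integrate the probabilistic outputs of different classifiers, and a particle swarm algorithm is adopted to search for optimized values of the fuzzy densities, which are crucial to the fuzzy integral.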
Six data sets are used in this paper. The first five are collected from an open database or calculated with software provided by that database, and the sixth is the union of the first five. Probabilistic support vector machines are applied as base learners to predict the functions of the examples in each data set. The Choquet fuzzy integral method is then applied to the base learners' probabilistic outputs on the first five data sets. The Choquet fuzzy integral method is compared with the weighted average method, the support vector machine method, and the K-nearest-neighbor method, using ten-fold cross-validation to measure performance. The experimental results show that the Choquet fuzzy integral method performs much better than the other methods, and that data fusion methods combining multiple types of biological data can substantially improve the results.
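The fusion step described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the record: that the fuzzy measure is a Sugeno λ-measure built from per-source fuzzy densities (a common choice when "fuzzy density" is mentioned), and that the densities are already known. The function names `solve_lambda` and `choquet_integral` and all numeric values are hypothetical placeholders. The particle swarm search for the densities is not shown.

```python
import math


def solve_lambda(densities):
    """Solve 1 + lam = prod(1 + lam * g_i) for lam > -1, lam != 0, by bisection.

    Assumption: a Sugeno lambda-measure; the record does not name the measure.
    """
    if len(densities) < 2 or any(not 0.0 < g < 1.0 for g in densities):
        raise ValueError("need at least two densities, each strictly in (0, 1)")
    total = sum(densities)
    if abs(total - 1.0) < 1e-6:
        return 0.0  # densities are (nearly) additive: plain weighted average

    def f(lam):
        return math.prod(1.0 + lam * g for g in densities) - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0, -1e-7  # f(lo) > 0, f(hi) < 0
    else:
        lo, hi = 1e-7, 1.0  # f(lo) < 0; enlarge hi until f(hi) >= 0
        while f(hi) < 0.0:
            hi *= 2.0
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def choquet_integral(scores, densities, lam):
    """Fuse one class's per-source probabilities with the Choquet integral."""
    # Visit sources from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fused, g_prev = 0.0, 0.0
    for i in order:
        # Measure of the set of sources visited so far (lambda-rule).
        g_curr = g_prev + densities[i] + lam * g_prev * densities[i]
        fused += scores[i] * (g_curr - g_prev)
        g_prev = g_curr
    return fused


# Hypothetical example: three sources, hand-picked densities and SVM outputs.
densities = [0.4, 0.3, 0.2]
scores = [0.9, 0.2, 0.6]
lam = solve_lambda(densities)
fused = choquet_integral(scores, densities, lam)
print(f"lambda = {lam:.4f}, fused score = {fused:.4f}")
```

When the densities sum to 1, λ is 0 and the integral reduces to the weighted average, which is why the weighted average method is a natural baseline for this kind of fusion.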
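Keywords: Choquet fuzzy integral; data fusion; protein function prediction
Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]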