Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2008, No. 1, pp. 83-90 (8 pages)
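Funding: National "863" Program (2006AA10Z313); National Natural Science Foundation of China (60773206/F020106, 60704047/F030304); National Defense Applied Basic Research Fund (A1420461266); Ministry of Education Trans-Century Excellent Talents Support Program (NCET-04-0496); Ministry of Education Key Scientific Research Fund (105087)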
Abstract: Conventional machine learning mainly addresses single-label learning, in which each sample has exactly one label. In bioinformatics, however, a gene often has more than one function and therefore needs more than one label, so multi-label learning identifies the functions of related gene groups more effectively than conventional approaches. Current research focuses mainly on supervised multi-label learning; semi-supervised multi-label learning from both labeled and unlabeled gene expression data remains an open problem. This paper proposes SML_SVM, a semi-supervised multi-label learning algorithm for gene function analysis. SML_SVM first uses the PT4 method to transform the semi-supervised multi-label problem into semi-supervised single-label problems, then estimates the labels of unlabeled examples using the maximum a posteriori (MAP) principle combined with the K-nearest neighbor method, and finally solves the resulting single-label problems with SVM. The distinctive characteristic of the algorithm is its integration of SVM-based single-label learning with the MAP and K-nearest neighbor methods. Experiments on the yeast gene expression dataset and the genbase protein dataset show that SML_SVM outperforms both the PT4-based MLSVM and self-training MLSVM.
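The first two stages named in the abstract can be illustrated with a minimal sketch. It assumes PT4 means the usual one-binary-problem-per-label transformation, and it replaces the paper's MAP-based estimate with a plain majority vote over the K nearest labeled neighbors, since the abstract does not give the exact estimator. The function names (`pt4_transform`, `knn_pseudo_label`) and the toy data are hypothetical, and the final SVM training step is omitted.

```python
from math import dist  # Euclidean distance, Python 3.8+


def pt4_transform(label_sets, all_labels):
    """Turn multi-label targets into one binary target list per label."""
    return {lab: [1 if lab in s else 0 for s in label_sets]
            for lab in all_labels}


def knn_pseudo_label(x, X_labeled, y_binary, k=3):
    """Pseudo-label x for one binary problem by majority vote of its
    k nearest labeled neighbors (a simplified stand-in for the MAP step)."""
    nearest = sorted(range(len(X_labeled)),
                     key=lambda i: dist(x, X_labeled[i]))[:k]
    votes = sum(y_binary[i] for i in nearest)
    return 1 if 2 * votes > k else 0


# Toy data (hypothetical): four labeled points, two unlabeled points.
X_labeled = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
label_sets = [{"A"}, {"A", "B"}, {"B"}, {"B", "C"}]
X_unlabeled = [(0.05, 0.1), (0.95, 1.0)]

# Stage 1: one binary problem per label.
binary_targets = pt4_transform(label_sets, ["A", "B", "C"])

# Stage 2: estimate each unlabeled point's value in every binary problem.
pseudo = {lab: [knn_pseudo_label(x, X_labeled, y, k=3) for x in X_unlabeled]
          for lab, y in binary_targets.items()}
```

In a fuller pipeline, each binary problem, now extended with the pseudo-labeled points, would be passed to a binary SVM, one classifier per label.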
Classification: TP18 [Automation and Computer Technology - Control Theory and Control Engineering]