Regularized Semi-Supervised Multi-Label Learning  (Cited by: 19)


Authors: Li Yufeng (李宇峰)[1], Huang Shengjun (黄圣君)[1], Zhou Zhihua (周志华)[1]

Affiliation: [1] National Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210093

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2012, No. 6, pp. 1272-1278 (7 pages)

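Funding: National Natural Science Foundation of China (61073097; 61021062); Natural Science Foundation of Jiangsu Province (BK2008018); National Basic Research Program of China ("973" Program) (2010CB327903)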

Abstract: Multi-label learning deals with examples that are associated with multiple class labels simultaneously. Previous multi-label studies usually assume that large amounts of labeled training examples are available. However, in many real-world applications, labeled examples are few while unlabeled examples are abundant and readily available. To exploit the abundant unlabeled examples to improve generalization performance, we propose a novel regularized inductive semi-supervised multi-label method named MASS. Specifically, aside from minimizing the empirical risk, MASS employs two regularizers to constrain the final decision function. One characterizes the classifier's complexity while taking label relatedness into account, and the other requires that similar examples have similar structural multi-label outputs. This leads to a large-scale convex optimization problem, and an efficient alternating optimization algorithm is provided that reaches the global optimum at a super-linear convergence rate owing to the strong convexity of the objective function. Comprehensive experimental results on two real-world tasks, i.e., webpage categorization and gene functional analysis, with varied numbers of labeled examples demonstrate the effectiveness of the proposal.
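The general idea in the abstract (empirical risk plus a complexity regularizer plus a "similar examples get similar outputs" regularizer) can be illustrated with a small sketch. This is not the MASS algorithm: the record does not give its objective, so the code below assumes a linear model, a squared loss, a plain Frobenius-norm penalty (label relatedness is ignored), and an RBF graph Laplacian as the similarity term. It also uses a closed-form solve instead of the alternating optimization the paper describes. The function names and the parameters `alpha`, `beta`, and `gamma` are hypothetical.

```python
import numpy as np

def rbf_graph_laplacian(X, gamma=1.0):
    """Unnormalized graph Laplacian L = D - A with RBF affinities (assumed similarity)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    A = np.exp(-gamma * d2)
    np.fill_diagonal(A, 0.0)          # no self-loops
    return np.diag(A.sum(axis=1)) - A

def fit_regularized_ssml(X_lab, Y_lab, X_unlab, alpha=1.0, beta=1.0, gamma=1.0):
    """Minimize ||X_lab W - Y_lab||_F^2 + alpha ||W||_F^2 + beta tr(W^T X^T L X W).

    Y_lab holds one column per label with entries in {-1, +1}.
    With alpha > 0 the objective is strongly convex, so the minimizer is unique.
    """
    X_all = np.vstack([X_lab, X_unlab])            # labeled and unlabeled examples
    L = rbf_graph_laplacian(X_all, gamma)
    d = X_all.shape[1]
    H = X_lab.T @ X_lab + alpha * np.eye(d) + beta * X_all.T @ L @ X_all
    return np.linalg.solve(H, X_lab.T @ Y_lab)     # W has shape (d, n_labels)

def predict(W, X):
    """Inductive prediction for unseen examples: sign of the linear score per label."""
    return np.where(X @ W > 0, 1, -1)
```

Because the model is a weight matrix rather than a set of labels for the given unlabeled points, it applies to new examples directly, which is the sense in which such a method is inductive.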

Keywords: machine learning; multi-label learning; semi-supervised learning; webpage categorization; gene functional analysis

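Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP391.41 [Automation and Computer Technology: Control Science and Engineering]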

 
