Multi-Label Learning Algorithms Based on Label-Specific Features  Cited by: 11

Label-Specific Features on Multi-Label Learning Algorithm


Authors: WU Lei [1,2], ZHANG Min-Ling [1,2]

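Affiliations: [1] School of Computer Science and Engineering, Southeast University, Nanjing 210096, Jiangsu, China; [2] Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 210096, Jiangsu, China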

Source: Journal of Software (软件学报), 2014, No. 9, pp. 1992-2001 (10 pages)

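Funding: National Natural Science Foundation of China (61175049; 61222309); Program for New Century Excellent Talents in University, Ministry of Education (NCET-13-0130)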

Abstract: In the multi-label learning framework, each object is described by a single instance (feature vector) yet is associated with multiple class labels simultaneously. A common strategy in existing multi-label learning algorithms is to use the same feature set to predict every class label. This strategy may be suboptimal, because each label may have specific characteristics of its own. Based on this assumption, a multi-label learning algorithm named LIFT has been proposed, which models each label with its own label-specific features. LIFT consists of two steps: label-specific feature construction and classification model induction. It first constructs the label-specific features of each label by clustering that label's positive and negative instances, and then trains a binary classifier for each label on the corresponding label-specific features. This paper retains LIFT's classification model induction step while examining three other mechanisms for constructing label-specific features, yielding three variants of LIFT: LIFT-MDDM, LIFT-INSDIF, and LIFT-MLF. Two groups of experiments on twelve multi-label benchmark datasets validate the effect of label-specific features on the performance of multi-label learning systems and the effectiveness of the label-specific feature construction method adopted by LIFT.
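The two LIFT steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract states only that features are built by clustering each label's positive and negative instances and that one binary classifier is trained per label. Everything else here is an assumption made for the sketch: the distance-to-cluster-center mapping, the `ratio` parameter that sets the number of clusters, the plain k-means routine, the logistic-regression stand-in for the binary classifier, and all function names. Each label is assumed to have at least one positive and one negative instance.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster went empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def label_specific_centers(X, y, ratio=0.5):
    """Step 1: cluster the positive and the negative instances of one label."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    # Assumed rule: same cluster count for both sides, set by `ratio`.
    k = max(1, math.ceil(ratio * min(len(pos), len(neg))))
    return kmeans(pos, k) + kmeans(neg, k)

def transform(x, centers):
    """Assumed mapping: distances from x to every cluster center."""
    return [math.dist(x, c) for c in centers]

def sigmoid(w, f):
    z = w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(F, y, lr=0.1, epochs=200):
    """Stand-in binary classifier: logistic regression trained by SGD."""
    w = [0.0] * (len(F[0]) + 1)  # w[0] is the bias term
    for _ in range(epochs):
        for f, t in zip(F, y):
            err = t - sigmoid(w, f)
            w[0] += lr * err
            for i, fi in enumerate(f):
                w[i + 1] += lr * err * fi
    return w

def fit_lift(X, Y, ratio=0.5):
    """Step 2: one binary model per label, on that label's own features."""
    models = []
    for k in range(len(Y[0])):
        y = [row[k] for row in Y]
        centers = label_specific_centers(X, y, ratio)
        F = [transform(x, centers) for x in X]
        models.append((centers, train_logistic(F, y)))
    return models

def predict_lift(models, x):
    """Return a 0/1 prediction for every label."""
    return [int(sigmoid(w, transform(x, c)) >= 0.5) for c, w in models]
```

Here `X` is a list of feature vectors and `Y` a list of 0/1 label vectors of equal length; `fit_lift(X, Y)` returns one `(centers, weights)` pair per label, and `predict_lift(models, x)` applies each label's own feature mapping before its classifier. Under the same assumptions, a variant in the spirit of those the paper studies would swap out `label_specific_centers` and `transform` while leaving the per-label training loop unchanged.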

Keywords: machine learning; multi-label learning; label-specific features; dimensionality reduction; label correlation

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
