Multi-label feature selection method based on correlation analysis  Cited by: 6

Authors: WANG Jin [1]; SUN Wantong

Affiliation: [1] Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China

Source: Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2021, No. 6, pp. 1024-1037 (14 pages)

Funding: National Natural Science Foundation of China (61806033).

Abstract: Most existing multi-label feature selection algorithms fail to remove redundant features from the feature space effectively, and they also ignore the differences between labels. To address both problems, a multi-label feature selection method based on correlation analysis is proposed. First, features are grouped according to the degree of correlation between them, which handles the correlation among features. Next, the samples are divided into a positive class and a negative class according to their label attributes. The positive samples and the negative samples are clustered separately, with the number of clusters determined independently for the positive side and the negative side. The distances from the original features to each cluster center in the positive and negative clusters are then computed, which yields a label-specific feature space. Finally, the label-shared feature space and the label-specific feature space are fused, so that both the individuality of each label and the relationships among labels are taken into account, which addresses the problem of label differences. In experiments on 9 multi-label data sets, the proposed method performed better than existing multi-label feature selection algorithms on every classification metric, demonstrating its effectiveness.

Keywords: machine learning; multi-label learning; feature selection; correlation analysis; feature space fusion

CLC number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
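
The pipeline described in the abstract (group correlated features, cluster positive and negative samples, measure distances to the cluster centers, then fuse the two feature spaces) can be illustrated with a small dependency-free Python sketch. The abstract does not state the correlation measure, the grouping rule, the clustering algorithm, or how the cluster counts are chosen, so everything below is an assumption made for illustration: Pearson correlation with a greedy threshold, plain k-means, cluster counts passed in by hand, and one representative feature per group as the shared space. All function names (`group_features`, `label_specific_features`, `fused_space`, and so on) are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b) if var_a > 0 and var_b > 0 else 0.0


def group_features(X, threshold=0.8):
    """Greedily group feature indices whose |correlation| with a seed feature >= threshold.

    Assumption: the paper's actual grouping rule is not given in the abstract.
    """
    cols = [list(c) for c in zip(*X)]
    unassigned = list(range(len(cols)))
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        group = [seed] + [j for j in unassigned
                          if abs(pearson(cols[seed], cols[j])) >= threshold]
        unassigned = [j for j in unassigned if j not in group]
        groups.append(group)
    return groups


def kmeans_centers(rows, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns at most k cluster centers."""
    if not rows:
        return []
    centers = [list(r) for r in random.Random(seed).sample(rows, min(k, len(rows)))]
    for _ in range(n_iter):
        buckets = [[] for _ in centers]
        for r in rows:
            nearest = min(range(len(centers)), key=lambda c: math.dist(r, centers[c]))
            buckets[nearest].append(r)
        for c, members in enumerate(buckets):
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(v) / len(members) for v in zip(*members)]
    return centers


def label_specific_features(X, y, k_pos=2, k_neg=2):
    """Distances from every sample to the centers of the positive and negative clusters."""
    pos = [r for r, t in zip(X, y) if t == 1]
    neg = [r for r, t in zip(X, y) if t != 1]
    centers = kmeans_centers(pos, k_pos) + kmeans_centers(neg, k_neg)
    return [[math.dist(r, c) for c in centers] for r in X]


def fused_space(X, y, threshold=0.8, k_pos=2, k_neg=2):
    """Concatenate shared features (one representative per group) with label-specific ones."""
    reps = [g[0] for g in group_features(X, threshold)]
    specific = label_specific_features(X, y, k_pos, k_neg)
    return [[row[j] for j in reps] + extra for row, extra in zip(X, specific)]


# Toy data: feature 1 is exactly twice feature 0, so the two fall into one group.
X = [[0.0, 0.0, 1.0], [1.0, 2.0, 0.0], [2.0, 4.0, 1.0], [3.0, 6.0, 0.0]]
y = [1, 1, 0, 0]  # a single label column; a multi-label run would repeat this per label
print(group_features(X))
print(fused_space(X, y, k_pos=1, k_neg=1))
```

In this sketch `k_pos` and `k_neg` are separate arguments, mirroring the abstract's point that the cluster counts for positive and negative samples are set independently. Keeping only the first feature of each group is just one possible way to form a shared space, not the paper's stated selection criterion.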

 
