Multilabel Relief-Based Feature Selection via Fusing K-Means Clustering and Label Correlation


Authors: FENG Changwu; SUN Lin (College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457, China)

Affiliation: [1] College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457

Source: Journal of Liaocheng University (Natural Science Edition), 2025, Issue 1, pp. 122-134 (13 pages)

Funding: Supported by the National Natural Science Foundation of China (61772176)

Abstract: Existing Relief algorithms are deficient in exploiting label correlation and often ignore the valuable information provided by local label correlation. To address this problem, this work proposes a multilabel Relief feature selection method that fuses K-means clustering with label correlation. First, to fully account for the label correlation of the samples, the K-means clustering algorithm is used to partition the samples into clusters, thereby constructing the local label space of the samples. Second, the Euclidean distance between all samples over the features is defined to measure the global label correlation of the samples. At the same time, the traditional cosine similarity is improved by using the square root of the L1 norm, and this improved cosine similarity is applied in the local label space to effectively obtain the local label correlation of the samples. Finally, on the basis of the Relief algorithm, the global and local label correlations of the samples are fused as the basis for measuring sample similarity; the nearest-neighbor same-class samples and nearest-neighbor different-class samples are then identified, and the feature weights are obtained. To evaluate the performance of the proposed algorithm, comparison tests were conducted on 10 publicly available multilabel datasets, and the experimental results show that the proposed algorithm has significant advantages over other multilabel feature selection algorithms.
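The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so every concrete choice here is an assumption rather than the authors' method: clustering is run on the feature vectors, the global similarity is taken as 1 / (1 + Euclidean distance), the improved cosine similarity is taken as the dot product divided by the square roots of the L1 norms, the two are fused by a weighted sum with a hypothetical parameter `alpha`, and two samples count as "same class" when their label similarity reaches a hypothetical threshold `same_thresh`. All function names are illustrative.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = X[assign == c].mean(axis=0)
    return assign

def sqrt_l1_cosine(Y):
    """Pairwise label similarity: dot product over sqrt(L1 norm) of each vector.

    Assumed form of the 'improved cosine similarity'. All-zero label
    vectors get similarity 0 instead of dividing by zero.
    """
    Y = np.asarray(Y, dtype=float)
    root = np.sqrt(np.abs(Y).sum(axis=1))
    denom = np.outer(root, root)
    dot = Y @ Y.T
    return np.divide(dot, denom, out=np.zeros_like(dot, dtype=float), where=denom > 0)

def relief_weights(X, Y, k=2, alpha=0.5, same_thresh=0.5):
    """Relief-style feature weights driven by a fused similarity (sketch)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, d = X.shape

    # Local label space (assumption): samples grouped by K-means on features.
    clusters = kmeans_labels(X, k)
    label_sim = sqrt_l1_cosine(Y)
    same_cluster = clusters[:, None] == clusters[None, :]
    local_sim = np.where(same_cluster, label_sim, 0.0)

    # Global term (assumption): similarity derived from Euclidean distance.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    global_sim = 1.0 / (1.0 + dist)

    # Fusion (assumption): convex combination of the two similarities.
    fused = alpha * global_sim + (1.0 - alpha) * local_sim

    w = np.zeros(d)
    for i in range(n):
        others = np.arange(n) != i
        hits = others & (label_sim[i] >= same_thresh)
        misses = others & (label_sim[i] < same_thresh)
        if not hits.any() or not misses.any():
            continue  # no usable neighbor pair for this sample
        hit = np.flatnonzero(hits)[fused[i, hits].argmax()]
        miss = np.flatnonzero(misses)[fused[i, misses].argmax()]
        # Classic Relief update: reward separation from the nearest miss,
        # penalize separation from the nearest hit.
        w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / n
    return w

# Toy data: feature 0 separates the two label groups, feature 1 does not.
f0 = np.repeat([0.0, 1.0], 10)
f1 = np.tile(np.arange(10) / 9.0, 2)
X_toy = np.column_stack([f0, f1])
Y_toy = np.repeat([[1, 0], [0, 1]], 10, axis=0)
print(relief_weights(X_toy, Y_toy))
```

On the toy data the first feature receives the larger weight, since it alone distinguishes the two label groups. For 0/1 label vectors the square root of the L1 norm equals the L2 norm, so under this assumed form the similarity coincides with the ordinary cosine on binary labels and differs only for non-binary label vectors.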

Keywords: multilabel learning; feature selection; K-means clustering; label correlation; Relief algorithm

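Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP311.13 [Automation and Computer Technology: Control Science and Engineering]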

 
