Unsupervised ReliefF Feature Selection Algorithm with Missing Data  Cited by: 4


Authors: XUE Lu-yu; SONG Yan [2] (College of Science, University of Shanghai for Science and Technology, Shanghai 200093, China; Control Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)

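Affiliations: [1] College of Science, University of Shanghai for Science and Technology; [2] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology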

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2023, No. 7, pp. 1441-1448 (8 pages)

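Funding: Supported by the National Natural Science Foundation of China (62073223) and by a project of the National Defense Science and Technology Key Laboratory of Aerospace Flight Dynamics Technology, Equipment Development Department of the Central Military Commission (6142210200304).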

Abstract: Most existing feature selection algorithms are designed for complete data sets and become ineffective when the data contain missing values or lack labels. To address feature selection on missing and unlabeled data sets, this paper proposes an improved ReliefF algorithm based on weighted FCM that integrates mutual information and alternately updates the feature weights (WFCM-IReliefF, Improved ReliefF Based on WFCM). First, the FCM algorithm performs unsupervised learning on the data set completed by mean pre-filling, recovering labels so that the nearest neighbors of each sample can be found. Second, the feature weights computed by the ReliefF algorithm are substituted into the weighted FCM algorithm to counter the poor clustering caused by the difference between the original space and the feature space, and the weighted FCM and ReliefF algorithms are updated alternately to obtain the key features. Third, matrix factorization is applied to the data set after feature selection to improve the pre-filled values of the missing data. Finally, comparative experiments on multiple UCI public data sets show that the proposed algorithm achieves fairly satisfactory results relative to the compared algorithms.
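The alternating scheme outlined in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' WFCM-IReliefF implementation: the mutual-information term and the matrix-factorization refill are omitted, and every function name, the weighted squared-Euclidean distance, the hard assignment of cluster labels, and the clipping of negative scores are assumptions made only for illustration.

```python
import numpy as np

def mean_prefill(X):
    """Replace NaN entries with the column mean of the observed values."""
    X = X.copy()
    col_mean = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_mean[cols]
    return X

def weighted_fcm(X, w, n_clusters, m=2.0, n_iter=100, seed=0):
    """Fuzzy C-means with a feature-weighted squared Euclidean distance (assumed form)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        diff = X[:, None, :] - centers[None, :, :]
        d2 = (w * diff ** 2).sum(axis=2) + 1e-12  # weighted squared distances
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # standard FCM membership update
    return U, centers

def relieff_scores(X, labels, k=5):
    """ReliefF-style feature scores, using cluster labels as pseudo-classes."""
    n, p = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Z = (X - X.min(axis=0)) / span                # scale each feature to [0, 1]
    classes, counts = np.unique(labels, return_counts=True)
    prior = dict(zip(classes, counts / n))
    score = np.zeros(p)
    for i in range(n):
        delta = np.abs(Z - Z[i])                  # per-feature differences to sample i
        dist = delta.sum(axis=1)                  # Manhattan distance to sample i
        for c in classes:
            idx = np.where(labels == c)[0]
            idx = idx[idx != i]
            if idx.size == 0:
                continue
            near = idx[np.argsort(dist[idx])[:k]]  # k nearest hits or misses
            mean_diff = delta[near].mean(axis=0)
            if c == labels[i]:
                score -= mean_diff                # penalize differences to hits
            else:
                score += prior[c] / (1.0 - prior[labels[i]]) * mean_diff
    return score / n

def alternate_weights(X_missing, n_clusters, n_rounds=5, k=5):
    """Alternate weighted FCM and ReliefF scoring on mean-prefilled data."""
    X = mean_prefill(X_missing)
    w = np.full(X.shape[1], 1.0 / X.shape[1])     # start from uniform weights
    for _ in range(n_rounds):
        U, _ = weighted_fcm(X, w, n_clusters)
        labels = U.argmax(axis=1)                 # hard pseudo-labels (assumption)
        s = np.clip(relieff_scores(X, labels, k), 0.0, None)
        if s.sum() == 0:
            break
        w = s / s.sum()                           # normalized non-negative weights
    return w
```

Calling `alternate_weights(X, n_clusters=2)` on an array with `NaN` entries returns one non-negative weight per feature, summing to 1; features with larger weights would be the candidates to keep.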

Keywords: feature selection; matrix factorization; fuzzy C-means clustering; unsupervised learning

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
