Unsupervised feature selection model with dictionary learning and sample correlation preservation


Authors: LIU Jingxin; HUANG Wenjing; XU Liangsheng; HUANG Chong[3]; WU Jiansheng[1]

Affiliations: [1] School of Mathematics and Computer Sciences, Nanchang University, Nanchang, Jiangxi 330031, China; [2] Northern Lianchuang Communication Company Limited, Nanchang, Jiangxi 330096, China; [3] Information Office, Nanchang University, Nanchang, Jiangxi 330031, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 12, pp. 3766-3775 (10 pages)

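Funding: National Natural Science Foundation of China (62066027); Natural Science Foundation of Jiangxi Province (20212BAB212011); Graduate Innovation Fund of Jiangxi Province (YC2022-s160).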

Abstract: Most unsupervised feature selection models based on dictionary learning do not fully exploit the intrinsic correlations among data, which reduces the accuracy of feature importance judgments. To address this issue, an unsupervised feature selection model combining Dictionary Learning and Sample Correlation Preservation (DLSCP) is proposed. First, dictionary atoms are learned from the data to encode the original data, and latent representations that reflect the data distribution are obtained in the dictionary space. Second, the intrinsic correlations among data are learned adaptively in the dictionary space to reduce the influence of redundant and noisy features, yielding an accurate local geometric structure among data. Finally, the intrinsic correlations among data are used to measure the relevance and importance of the data features. Experimental results on the TOX dataset show that, when 50 features are selected, DLSCP improves Normalized Mutual Information (NMI) and clustering Accuracy (Acc) by 13.33 and 7.95 percentage points, respectively, over the nonnegative spectral analysis model NDFS (Nonnegative Discriminative Feature Selection), and by 15.74 and 7.31 percentage points, respectively, over the latent space embedding unsupervised feature selection model LSEUFS (Latent Space Embedding for Unsupervised Feature Selection via joint dictionary learning), which verifies the effectiveness of DLSCP.

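Keywords: unsupervised feature selection; dictionary learning; adaptive graph learning; sample correlation preservation; similarity matrix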

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
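The abstract reports results in terms of NMI and clustering accuracy (Acc). As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed from ground-truth class labels and predicted cluster labels. It is not taken from the paper: the function names `nmi` and `clustering_accuracy` are hypothetical, the NMI normalization (arithmetic mean of the two entropies) is an assumption because the abstract does not state which variant is used, and the best cluster-to-class mapping is found by brute force over permutations, which is only practical for a small number of clusters (the Hungarian algorithm, e.g. `scipy.optimize.linear_sum_assignment`, is the usual choice at scale).

```python
import math
from collections import Counter
from itertools import permutations

_PAD = object()  # sentinel used to pad the shorter label set


def nmi(true_labels, pred_labels):
    """NMI normalized by the arithmetic mean of the two entropies (assumed variant)."""
    n = len(true_labels)
    count_true = Counter(true_labels)
    count_pred = Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))
    # Mutual information between the two labelings (natural log).
    mi = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )
    h_true = -sum((c / n) * math.log(c / n) for c in count_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in count_pred.values())
    denom = (h_true + h_pred) / 2
    # Both labelings are a single group: treat them as identical.
    return mi / denom if denom > 0 else 1.0


def clustering_accuracy(true_labels, pred_labels):
    """Fraction of samples matched under the best one-to-one cluster-to-class mapping."""
    classes = list(dict.fromkeys(true_labels))
    clusters = list(dict.fromkeys(pred_labels))
    joint = Counter(zip(pred_labels, true_labels))
    # Pad so that both sides have the same number of entries.
    k = max(len(classes), len(clusters))
    classes += [_PAD] * (k - len(classes))
    clusters += [_PAD] * (k - len(clusters))
    # Brute force over all mappings: O(k!), fine only for small k.
    best = max(
        sum(joint[(c, t)] for c, t in zip(clusters, perm))
        for perm in permutations(classes)
    )
    return best / len(true_labels)


# Toy example: the same partition with permuted cluster ids.
truth = [0, 0, 1, 1, 2, 2]
clusters = [1, 1, 0, 0, 2, 2]
print(clustering_accuracy(truth, clusters))
print(nmi(truth, clusters))
```

Both metrics are invariant to how the cluster ids are numbered, which is why the toy example scores a relabeled but otherwise identical partition as a perfect match (up to floating-point rounding for NMI).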

 
