Incomplete multi-view clustering algorithm based on self-attention fusion  Cited by: 2


Authors: LI Shunyong[1]; LI Shiyi; XU Rui; ZHAO Xingwang[2]

Affiliations: [1] School of Mathematical Sciences, Shanxi University, Taiyuan, Shanxi 030006, China; [2] School of Computer and Information Technology (School of Big Data), Shanxi University, Taiyuan, Shanxi 030006, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 9, pp. 2696-2703 (8 pages)

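Funding: National Natural Science Foundation of China (61976128, 62072293); Fundamental Research Program of Shanxi Province (202303021221054); Shanxi Province Graduate Education and Teaching Reform Project (2022YJJG010).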

Abstract: Multi-view clustering on incomplete data has become a research hotspot in unsupervised learning. However, most multi-view clustering algorithms based on "shallow" models struggle to extract and characterize the latent feature structure within each view when facing large-scale, high-dimensional data. In addition, fusing multi-view information by stacking or averaging ignores the differences between views and does not fully account for each view's distinct contribution to building a common consensus representation. To address these problems, an Incomplete Multi-View Clustering algorithm based on Self-Attention Fusion (IMVCSAF) is proposed. First, latent features of each view are extracted with deep autoencoders, and contrastive learning is used to maximize the consistency information among views. Second, a self-attention mechanism re-encodes and fuses the latent representations of the views, comprehensively considering and mining the inherent causality and feature complementarity between different views. Third, the latent representations of missing instances are predicted and recovered from the common consensus representation, which completes the multi-view clustering process. Experimental results on the Scene-15, LandUse-21, Caltech101-20 and Noisy-MNIST datasets show that IMVCSAF achieves higher accuracy than the other comparison algorithms while meeting the convergence requirements. On the Noisy-MNIST dataset with a 50% missing rate, the accuracy of IMVCSAF is 6.58 percentage points higher than that of the second-best algorithm, COMPLETER (inCOMPlete muLti-view clustEring via conTrastivE pRediction).
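The fusion step in the abstract can be illustrated with a minimal NumPy sketch of scaled dot-product self-attention applied across views. This is not the authors' implementation: the function name `self_attention_fuse`, the projection matrices `w_q`, `w_k`, `w_v`, and the choice to concatenate the re-encoded views into one common representation are all assumptions made for illustration, and the autoencoders, contrastive loss and missing-view prediction are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_fuse(latents, w_q, w_k, w_v):
    """Re-encode per-view latent vectors with self-attention and fuse them.

    latents: array of shape (n_samples, n_views, d), one latent vector per view
             (assumed to come from per-view encoders, not shown here).
    w_q, w_k, w_v: hypothetical projection matrices of shape (d, d_k).
    Returns (fused, attn) with shapes (n_samples, n_views * d_k) and
    (n_samples, n_views, n_views).
    """
    q = latents @ w_q
    k = latents @ w_k
    v = latents @ w_v
    # Each view attends to every view of the same sample.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    recoded = attn @ v
    # Assumption: concatenate the re-encoded views into one representation.
    fused = recoded.reshape(recoded.shape[0], -1)
    return fused, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(5, 2, 4))          # 5 samples, 2 views, 4-dim latents
    w_q, w_k, w_v = (rng.normal(size=(4, 3)) for _ in range(3))
    fused, attn = self_attention_fuse(z, w_q, w_k, w_v)
    print(fused.shape, attn.shape)
```

Each row of `attn` sums to 1 and gives the weight one view places on every view of the same sample, which is one way to let views contribute unequally instead of being stacked or averaged directly.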

Keywords: multi-view clustering; self-attention; mutual information; representation learning; deep learning

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
