Incomplete multi-view clustering algorithm based on attention mechanism

Authors: YANG Chenghao; HU Jie; WANG Hongjun [1]; PENG Bo [1] (School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China)

Affiliation: [1] School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 12, pp. 3784-3789 (6 pages)

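Funding: National Natural Science Foundation of China (62276216); Sichuan Province Key Research and Development Program (2023YFG0354); 2023 Southwest Jiaotong University International Student Education Management Research Project (23LXSGL01).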

Abstract: To address the uncertainty of completing missing view data, the lack of robustness in embedding learning, and the low model generalization of traditional deep incomplete multi-view clustering algorithms, an Incomplete Multi-View Clustering algorithm based on Attention Mechanism (IMVCAM) is proposed. First, the K-Nearest Neighbors (KNN) algorithm is used to complete the missing data in each view, making the training data complementary. Second, after the linear encoding layer, the obtained embeddings are passed through an attention layer to improve their quality. Finally, the embeddings learned for each view are clustered with the k-means algorithm, and the view weights are determined by the Pearson correlation coefficient. Experimental results on five classic datasets show that IMVCAM achieves the best results on the Fashion dataset: compared with the second-best algorithm, Deep Safe Incomplete Multi-View Clustering (DSIMVC), it improves clustering accuracy by 2.85 and 4.35 percentage points at data missing rates of 0.1 and 0.3, respectively. On the Caltech101-20 dataset, compared with the second-best algorithm, Incomplete Multi-View Clustering based on Self-Attention Fusion (IMVCSAF), IMVCAM improves clustering accuracy by 7.68 and 3.48 percentage points at missing rates of 0.1 and 0.3, respectively. The proposed algorithm can effectively handle the incompleteness of multi-view data and the problem of model generalization.
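The abstract names the completion and weighting steps but does not give their details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a sample missing in one view is filled with the mean of its k nearest neighbors, where neighbors are found in a second view in which every sample is observed. It also assumes, purely for illustration, that view weights come from the Pearson correlation between the pairwise-distance profiles of the views; the abstract does not say which quantities are correlated. All function names (`knn_complete`, `pearson`, `view_weights`) are hypothetical, and the encoder, attention layer, and k-means steps are not covered.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_complete(target, reference, k=2):
    """Fill missing samples (None) in `target` from neighbors found in `reference`.

    Assumption for this sketch: both views list the same samples in the same
    order, and every sample is observed in `reference`.
    """
    observed = [i for i, row in enumerate(target) if row is not None]
    if not observed:
        raise ValueError("target view has no observed samples")
    completed = []
    for i, row in enumerate(target):
        if row is not None:
            completed.append(list(row))
            continue
        # Rank the samples observed in `target` by their distance to sample i
        # as measured in the reference view, then average the k closest ones.
        nearest = sorted(
            observed, key=lambda j: euclidean(reference[i], reference[j])
        )[:k]
        dim = len(target[nearest[0]])
        completed.append(
            [sum(target[j][d] for j in nearest) / len(nearest) for d in range(dim)]
        )
    return completed


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x > 0 and std_y > 0 else 0.0


def pairwise_distances(view):
    """Upper-triangle pairwise distances of a completed view, as a flat list."""
    n = len(view)
    return [euclidean(view[i], view[j]) for i in range(n) for j in range(i + 1, n)]


def view_weights(views):
    """Hypothetical weighting: mean |Pearson r| with the other views, normalized."""
    profiles = [pairwise_distances(v) for v in views]
    scores = []
    for a, profile_a in enumerate(profiles):
        others = [
            abs(pearson(profile_a, profile_b))
            for b, profile_b in enumerate(profiles)
            if b != a
        ]
        scores.append(sum(others) / len(others))
    total = sum(scores)
    if total == 0:
        return [1.0 / len(views)] * len(views)
    return [s / total for s in scores]


# Toy data: sample 2 is missing in view_b and is filled from its neighbor in view_a.
view_a = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
view_b = [[1.0], [1.2], None, [9.0]]
view_b_completed = knn_complete(view_b, view_a, k=1)
weights = view_weights([view_a, view_b_completed])
```

In this toy example the missing sample sits next to sample 3 in `view_a`, so with `k=1` it inherits that sample's features in `view_b`. With only two views, the hypothetical weighting necessarily assigns them equal weight; differences appear only with three or more views.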

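Keywords: incomplete multi-view clustering; K-nearest neighbors algorithm; attention mechanism; k-means clustering algorithm; Pearson correlation coefficient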

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
