Authors: YANG Chenghao (杨成昊); HU Jie (胡节); WANG Hongjun (王红军)[1]; PENG Bo (彭博)[1]
Affiliation: [1] School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
Source: Journal of Computer Applications (《计算机应用》), 2024, No. 12, pp. 3784-3789 (6 pages)
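Funding: National Natural Science Foundation of China (62276216); Sichuan Province Key Research and Development Program (2023YFG0354); 2023 Southwest Jiaotong University International Student Education Management Research Project (23LXSGL01).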
Abstract: To address three problems in traditional deep incomplete multi-view clustering algorithms (uncertainty in completing missing view data, a lack of robustness in embedding learning, and low model generalization), an Incomplete Multi-View Clustering algorithm based on Attention Mechanism (IMVCAM) was proposed. Firstly, the K-Nearest Neighbors (KNN) algorithm was used to complete the missing data in the views, making the training data complementary. Secondly, after a linear encoding layer, the resulting embeddings were passed through an attention layer to improve their quality. Finally, the embeddings obtained from training each view were clustered with the k-means clustering algorithm, and the view weights were determined by the Pearson correlation coefficient. Experimental results on five classic datasets show that IMVCAM achieved the best results on the Fashion dataset: compared with the second-best algorithm, Deep Safe Incomplete Multi-View Clustering (DSIMVC), its clustering accuracy improved by 2.85 and 4.35 percentage points at data missing rates of 0.1 and 0.3, respectively. On the Caltech101-20 dataset, compared with the second-best algorithm, Incomplete Multi-View Clustering algorithm based on Self-Attention Fusion (IMVCSAF), the clustering accuracy of IMVCAM improved by 7.68 and 3.48 percentage points at missing rates of 0.1 and 0.3, respectively. The proposed algorithm can effectively deal with the incompleteness of multi-view data and the problem of model generalization.
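The abstract names KNN completion and the Pearson correlation coefficient but does not give their exact formulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `knn_complete` and `pearson`, the use of Euclidean distance in another observed view to find neighbours, and the averaging of neighbour features are all hypothetical choices made for illustration. The attention layer, the k-means step, and the way view weights are derived from the coefficient are not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_complete(views, k=2):
    """Fill missing samples (None) in each view.

    Assumption (not from the paper): a missing sample in view v is replaced
    by the mean of the view-v features of its k nearest neighbours, where
    nearness is measured in another view in which the sample is observed.
    `views` is a list of views; each view is a list of vectors or None.
    """
    n = len(views[0])
    completed = [list(view) for view in views]
    for v, view in enumerate(views):
        for i in range(n):
            if view[i] is not None:
                continue
            # Pick any other view in which sample i is observed.
            ref = next((r for r in range(len(views))
                        if r != v and views[r][i] is not None), None)
            if ref is None:
                continue  # sample is missing everywhere; leave it as None
            # Candidates must be observed in both the target and reference view.
            candidates = [j for j in range(n)
                          if views[v][j] is not None and views[ref][j] is not None]
            if not candidates:
                continue
            candidates.sort(key=lambda j: euclidean(views[ref][i], views[ref][j]))
            nearest = candidates[:k]
            dim = len(views[v][nearest[0]])
            completed[v][i] = [sum(views[v][j][d] for j in nearest) / len(nearest)
                               for d in range(dim)]
    return completed


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy data: sample 2 is missing in view_b and observed in view_a.
view_a = [[0.0], [1.0], [10.0], [11.0]]
view_b = [[0.0, 0.0], [2.0, 2.0], None, [20.0, 20.0]]
filled = knn_complete([view_a, view_b], k=2)
```

In this toy case the two nearest neighbours of sample 2 in `view_a` are samples 3 and 1, so the missing entry in `view_b` is filled with the mean of their `view_b` features.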
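Keywords: incomplete multi-view clustering; K-Nearest Neighbors (KNN) algorithm; attention mechanism; k-means clustering algorithm; Pearson correlation coefficient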
Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]