Image clustering algorithm based on semantic optimization


Authors: ZHANG Kai; SONG Chengyun (College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China)

Affiliation: [1] College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China

Source: Journal of Computer Applications (《计算机应用》), 2023, Issue S02, pp. 117-121 (5 pages)

Funding: Chongqing University of Technology Action Plan for High-Quality Development of Graduate Education (gzlcx20232065).

Abstract: To address the problem that semantic features learned through contrastive learning in deep clustering carry insufficient information, an algorithm for optimizing semantic features is proposed. In the pre-training stage, a reconstruction loss is used as a regularization term to increase the mutual information between the feature representation and the input, which approximately introduces more information relevant to the clustering task and reduces the risk that contrastive learning overfits to shared information. In the fine-tuning stage, the traditional approach of updating the clustering algorithm and the clustering network simultaneously is abandoned; instead, the difference in similarity between an image and its nearest neighbors is used as the loss for updating the clustering network, so as to make maximal use of the semantic information among neighboring images. Experimental results on the CIFAR10, CIFAR100 and STL10 datasets show that the proposed algorithm improves accuracy on STL10 by 2.7 percentage points over the second-best algorithm, SCAN (Semantic Clustering by Adopting Nearest neighbors), and also leads on the Normalized Mutual Information (NMI) and Adjusted Rand Index (ARI) metrics, which validates the effectiveness of the proposed algorithm.
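The abstract does not give the loss formulas, so the following is only a minimal sketch of how the two stages might be written down, not the authors' implementation. It assumes an InfoNCE-style contrastive loss plus a weighted mean-squared reconstruction term for pre-training, and it substitutes the SCAN-style neighbor-consistency loss (with an entropy term against cluster collapse) for the paper's own neighbor-similarity-difference loss, whose exact form is not stated here. All function names, `recon_weight`, `entropy_weight`, and `temperature` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [a / norm for a in v]

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE-style loss; z1[i] and z2[i] embed two views of image i."""
    z1 = [normalize(v) for v in z1]
    z2 = [normalize(v) for v in z2]
    total = 0.0
    for i, anchor in enumerate(z1):
        logits = [dot(anchor, other) / temperature for other in z2]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(z1)

def reconstruction_mse(inputs, reconstructions):
    """Mean squared error between flattened inputs and decoder outputs."""
    squared_errors = [
        (x - r) ** 2
        for image, recon in zip(inputs, reconstructions)
        for x, r in zip(image, recon)
    ]
    return sum(squared_errors) / len(squared_errors)

def pretrain_loss(z1, z2, inputs, reconstructions,
                  recon_weight=1.0, temperature=0.5):
    """Assumed pre-training objective: contrastive loss regularized by reconstruction."""
    return (info_nce(z1, z2, temperature)
            + recon_weight * reconstruction_mse(inputs, reconstructions))

def neighbor_consistency_loss(probs, neighbor_probs, entropy_weight=1.0):
    """SCAN-style stand-in for the fine-tuning loss (an assumption).

    probs[i] is the soft cluster assignment of image i and
    neighbor_probs[i] that of one of its nearest neighbors.
    """
    eps = 1e-8
    n = len(probs)
    k = len(probs[0])
    # Reward an image and its neighbor for landing in the same cluster.
    consistency = -sum(
        math.log(dot(p, q) + eps) for p, q in zip(probs, neighbor_probs)
    ) / n
    # Negative entropy of the mean assignment discourages a single big cluster.
    mean_p = [sum(p[c] for p in probs) / n for c in range(k)]
    neg_entropy = sum(m * math.log(m + eps) for m in mean_p)
    return consistency + entropy_weight * neg_entropy
```

In this sketch the reconstruction term vanishes when the decoder reproduces its input exactly, so `pretrain_loss` then reduces to the contrastive loss alone; `recon_weight` controls how strongly the representation is pushed to retain input information.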

Keywords: deep clustering; contrastive learning; semantic features; overfitting; regularization

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
