Semantic Contrastive Clustering with Federated Data Augmentation (联合数据增强的语义对比聚类)

Cited by: 3

Authors: Wang Qihong; Jia Hongjie; Huang Longxia; Mao Qirong [1]

Affiliation: [1] School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2024, No. 6, pp. 1511-1524 (14 pages)

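Funding: National Natural Science Foundation of China (61906077, 62102168, 62176106, U1836220); Natural Science Foundation of Jiangsu Province (BK20190838, BK20200888); China Postdoctoral Science Foundation (2020T130257, 2020M671376); Jiangsu Postdoctoral Science Foundation (2021K596C).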

Abstract: Given the excellent performance of contrastive learning on downstream tasks, contrastive clustering has received considerable attention. However, most approaches use only one simple type of data augmentation. Although the augmented views retain most of the feature information of the original samples, they also inherit the entanglement of semantic and non-semantic information, which limits the model's ability to learn semantic information when the views share similar or identical view patterns. Some approaches directly form positive pairs from two augmented views of the same sample that share the same view pattern, which yields sample pairs with insufficient semantics. To solve these problems, we propose a semantic contrastive clustering method with federated data augmentation. Two types of data augmentation, one strong and one weak, produce two markedly different view patterns, and the differences between the views are used to reduce the interference of non-semantic information and strengthen the model's perception of semantic information. In addition, a global k-nearest neighbor graph introduces global category information, so that different samples from the same class form positive pairs. Experiments on six commonly used, challenging image datasets show that the proposed method achieves state-of-the-art clustering performance, confirming its effectiveness and superiority.
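The abstract does not give the loss function or the graph construction, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: embeddings of a weak and a strong view are contrasted, and positives for each sample are the strong views of the sample itself and of its k nearest neighbors in a global graph. The cosine-similarity graph, the averaged InfoNCE-style loss, and every name here (`knn_indices`, `neighbor_contrastive_loss`, `temperature`) are illustrative choices, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def knn_indices(features, k):
    """Return, for each row, the indices of its k most cosine-similar other rows."""
    z = l2_normalize(features)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def neighbor_contrastive_loss(weak, strong, neighbors, temperature=0.5):
    """Assumed loss: the anchor is the weak view of sample i; positives are the
    strong views of i and of i's neighbors; all other strong views are negatives."""
    n, k = neighbors.shape
    logits = l2_normalize(weak) @ l2_normalize(strong).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    pos_mask = np.eye(n, dtype=bool)
    pos_mask[np.repeat(np.arange(n), k), neighbors.ravel()] = True

    # Average the negative log-probability over each anchor's positives.
    per_anchor = -(log_prob * pos_mask).sum(axis=1) / pos_mask.sum(axis=1)
    return per_anchor.mean()


# Toy embeddings: two well-separated groups of three samples each.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
                     [0.0, 1.0], [0.1, 0.9], [0.05, 0.95]])
neighbors = knn_indices(features, k=2)
loss_aligned = neighbor_contrastive_loss(features, features, neighbors)
loss_swapped = neighbor_contrastive_loss(features, features[::-1], neighbors)
```

In this toy setup the loss is lower when the strong-view embeddings agree with the neighbor structure (`loss_aligned`) than when the two groups are swapped (`loss_swapped`), which is the behavior a neighbor-based positive-pair scheme is meant to reward.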

Keywords: strong data augmentation; weak data augmentation; contrastive learning; global category information; clustering

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
