Authors: CAI Xiaodong [1], DONG Lifang, HUANG Yeyang, ZHOU Li
Affiliations: [1] School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China; [2] Nanning West Bund Fenggu Business Data Co., Ltd., Nanning 530008, Guangxi, China
Source: Journal of South China University of Technology (Natural Science Edition), 2025, No. 3, pp. 50-56 (7 pages)
Funding: Guangxi Innovation-Driven Development Special Project (AA20302001).
Abstract: Current unsupervised contrastive learning methods rely mainly on pure textual information to construct sentence embeddings, which limits their ability to comprehensively understand the deeper meanings conveyed by sentences. Meanwhile, traditional contrastive learning methods focus excessively on maximizing the mutual information between positive text instances, overlooking potential noise interference within sentence embeddings. To retain the useful information in the text while effectively eliminating noise from the embeddings, this paper proposes a contrastive learning model based on text-vision fusion and information entropy minimization. First, the text and its corresponding visual information are deeply fused within the contrastive learning framework and jointly mapped into a unified grounding space, where their representations are kept consistent. This overcomes the limitation of relying solely on textual information for sentence embedding learning and makes the contrastive learning process more comprehensive and precise. Then, following the information minimization principle, the model reconstructs positive text instances via information entropy minimization while maximizing the mutual information between them. Experimental results on standard semantic textual similarity (STS) tasks show that the proposed model achieves significant improvements in the Spearman correlation coefficient and a notable advantage over existing state-of-the-art methods, confirming its effectiveness.
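The abstract's core alignment step, mapping text and visual embeddings into a shared grounding space and maximizing mutual information between matched positives, is commonly realized with a symmetric InfoNCE-style contrastive loss. The sketch below is a generic illustration of that idea only, not the authors' exact objective; the function name, temperature value, and NumPy implementation are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(text_emb, vis_emb, temperature=0.05):
    """Symmetric InfoNCE-style loss aligning paired text/visual embeddings.

    Row i of text_emb and row i of vis_emb are treated as a positive pair;
    all other rows in the batch serve as negatives (a generic sketch, not
    the paper's exact loss).
    """
    # L2-normalize so the dot product equals cosine similarity
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = vis_emb / np.linalg.norm(vis_emb, axis=1, keepdims=True)
    logits = t @ v.T / temperature          # (N, N): positives on the diagonal
    labels = np.arange(len(t))

    def xent(l):
        # numerically stable softmax cross-entropy against the diagonal
        l = l - l.max(axis=1, keepdims=True)
        p = np.exp(l) / np.exp(l).sum(axis=1, keepdims=True)
        return -np.log(p[labels, labels]).mean()

    # average of text->vision and vision->text directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimizing this loss pulls each text embedding toward its paired visual embedding and pushes it away from the other pairs in the batch, which is one standard way to maximize a lower bound on the mutual information between the two views.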
Keywords: unsupervised contrastive learning; mutual information; text-vision; information entropy minimization; semantic textual similarity
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]