Fake News Detection Based on Pre-Training and Multi-Modal Fusion

Cited by: 1

Authors: ZHOU Haowei, LIU Yong [1], XUAN Ping [1,2]

Affiliations: [1] School of Computer Science and Technology, Heilongjiang University, Harbin 150080, Heilongjiang, China; [2] Department of Computer Science and Technology, School of Engineering, Shantou University, Shantou 515063, Guangdong, China

Source: Computer Engineering, 2024, No. 1, pp. 289-295 (7 pages)

Funding: National Natural Science Foundation of China (61972135); Natural Science Foundation of Heilongjiang Province (LH2020F043).

Abstract: Existing multi-modal detection models typically just concatenate the features of each modality, so they cannot effectively model the correlation between modalities, and they are difficult to transfer to domains with scarce labels. This paper proposes PMFD, a fake news detection model based on pre-training and multi-modal fusion. Features extracted from different regions of the image accompanying a news item serve as the image raw vectors, and these are merged into an image guide vector. Three multi-modal fusion modes are designed: early fusion, middle fusion, and late fusion. In early fusion, the image guide vector initializes the text feature extractor, which produces the text raw vectors; these are merged into a text guide vector. In middle fusion, the feature representation of each modality is constructed from that modality's set of raw vectors and the guide vectors of the other modalities. In late fusion, the feature representations of the different modalities are fused to construct the feature representation of the news item. To improve generalization, PMFD is pre-trained on label-rich data and then fine-tuned on label-scarce data. Experiments on public datasets show that PMFD detects fake news effectively, improving by more than 10% over traditional models such as CNN, LSTM, and BERT, and by 2% to 3% over the multi-modal fake news detection models EANN and M_model.
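The data flow through the three fusion stages can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how vectors are merged or combined, so mean pooling for the guide vectors, dot-product attention for middle fusion, concatenation for late fusion, and the toy weight-free recurrent text encoder are all assumptions, as are the function names `mean_pool`, `attend`, `encode_text`, and `fuse_news`.

```python
import math


def mean_pool(vectors):
    """Merge equal-length raw vectors into one guide vector (assumed: averaging)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def attend(raw_vectors, guide):
    """Weighted sum of raw vectors, weighted by softmax similarity to the guide.

    Dot-product attention is an assumption; the abstract only says the raw
    vectors and the other modality's guide vector are combined.
    """
    scores = [sum(r * g for r, g in zip(vec, guide)) for vec in raw_vectors]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        sum(w * vec[i] for w, vec in zip(weights, raw_vectors))
        for i in range(len(guide))
    ]


def encode_text(token_vectors, init_state):
    """Toy recurrent encoder (no learned weights) started from init_state."""
    state = init_state
    outputs = []
    for token in token_vectors:
        state = [math.tanh(t + s) for t, s in zip(token, state)]
        outputs.append(state)
    return outputs


def fuse_news(image_regions, token_vectors):
    """Run early, middle, and late fusion and return the news representation."""
    image_guide = mean_pool(image_regions)
    # Early fusion: the image guide vector initializes the text extractor.
    text_raw = encode_text(token_vectors, image_guide)
    text_guide = mean_pool(text_raw)
    # Middle fusion: each modality's raw vectors meet the other's guide vector.
    image_feature = attend(image_regions, text_guide)
    text_feature = attend(text_raw, image_guide)
    # Late fusion: combine the modality features (assumed: concatenation).
    return image_feature + text_feature


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]   # two image-region vectors
    tokens = [[0.5, -0.5], [0.1, 0.2]]   # two token vectors
    print(fuse_news(regions, tokens))
```

In a trained model the encoders and attention would carry learned parameters, and the fused vector would feed a classifier; the sketch only shows which vectors feed which stage.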

Keywords: fake news detection; pre-training; multi-modal fusion; guide vector; cross-modal; shared features; staged fusion

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
