Authors: ZHOU Haowei (周昊玮), LIU Yong (刘勇) [1], XUAN Ping (玄萍) [1,2]
Affiliations: [1] School of Computer Science and Technology, Heilongjiang University, Harbin 150080, Heilongjiang, China; [2] Department of Computer Science and Technology, School of Engineering, Shantou University, Shantou 515063, Guangdong, China
Source: Computer Engineering (《计算机工程》), 2024, No. 1, pp. 289-295 (7 pages)
Funding: National Natural Science Foundation of China (61972135); Natural Science Foundation of Heilongjiang Province (LH2020F043).
Abstract: Existing multi-modal detection models typically concatenate the features of each modality, which fails to model the correlations between modalities effectively, and they are difficult to transfer to domains with few labels. This paper proposes PMFD, a fake news detection model based on pre-training and multi-modal fusion. Features are extracted from different regions of the image accompanying a news item to serve as image raw vectors, and these are merged into an image guide vector. Three multi-modal fusion stages are designed: early fusion, middle fusion, and late fusion. In early fusion, the image guide vector initializes the text feature extractor, which produces the text raw vectors; these are merged into a text guide vector. In middle fusion, the feature representation of each modality is constructed from that modality's set of raw vectors together with the guide vectors of the other modalities. In late fusion, the feature representations of the different modalities are fused to construct the feature representation of the news item. To improve generalization, PMFD is first pre-trained on label-rich data and then fine-tuned on label-sparse data. Experimental results on public datasets show that PMFD detects fake news effectively, improving by more than 10% over traditional models such as CNN, LSTM, and BERT, and by 2%-3% over the multi-modal fake news detection models EANN and M_model.
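The data flow through the three fusion stages can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how vectors are merged, how the text extractor is initialized, or how representations are fused, so the mean pooling, the tanh recurrence, the dot-product attention, and the concatenation below are all assumptions, as are the function names. The sketch also assumes image and text vectors share one dimension.

```python
import math
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Merge raw vectors into one guide vector (assumed: element-wise mean)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def encode_text(token_vectors: List[Vector], init_state: Vector) -> List[Vector]:
    """Toy recurrent extractor whose initial state is the image guide vector.

    The tanh update is a stand-in for a real text encoder (assumption).
    """
    state = init_state
    outputs = []
    for token in token_vectors:
        state = [math.tanh(0.5 * s + 0.5 * t) for s, t in zip(state, token)]
        outputs.append(state)
    return outputs


def attend(raw_vectors: List[Vector], guide: Vector) -> Vector:
    """Weight one modality's raw vectors by similarity to another modality's
    guide vector (assumed: softmax over dot products)."""
    scores = [sum(r * g for r, g in zip(vec, guide)) for vec in raw_vectors]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * vec[i] for w, vec in zip(weights, raw_vectors))
        for i in range(len(guide))
    ]


def fuse_news(image_regions: List[Vector], token_vectors: List[Vector]) -> Vector:
    image_guide = mean_pool(image_regions)
    # Early fusion: the image guide vector initializes the text extractor.
    text_raw = encode_text(token_vectors, image_guide)
    text_guide = mean_pool(text_raw)
    # Middle fusion: each modality's raw vectors meet the other's guide vector.
    image_repr = attend(image_regions, text_guide)
    text_repr = attend(text_raw, image_guide)
    # Late fusion (assumed: concatenation) yields the news representation.
    return image_repr + text_repr


# Made-up inputs: two image regions and three tokens, all 3-dimensional.
image_regions = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
token_vectors = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
news_repr = fuse_news(image_regions, token_vectors)
print(len(news_repr))  # 6
```

In a trained model, the resulting news representation would feed a classifier, and the pre-training and fine-tuning described in the abstract would update learned parameters that this parameter-free sketch omits.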
Keywords: fake news detection; pre-training; multi-modal fusion; guide vector; cross-modal shared features; staged fusion
Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]