Short Text Classification Based on Dual-Channel Feature Fusion and Adversarial Training


Authors: YU Jinping[1], YAO Xuanchen (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341400, China)

Affiliation: [1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341400, Jiangxi, China

Source: Software Guide (《软件导刊》), 2025, No. 2, pp. 56-61 (6 pages)

Funding: Central Government-Guided Local Science and Technology Development Fund Project (20201ZDI03003).

摘  要:针对短文本语言的稀疏性导致语义分析困难的问题,提出一种结合双通道特征融合和对抗训练的短文本分类模型。首先,采用ChineseBERT进行词嵌入表示,解决中文短文本词汇稀疏的挑战;其次,引入FGM对抗训练技术以增强整体模型的鲁棒性和泛化能力;再次,通过双通道DPCNN和BiGRU进行特征提取以丰富语义信息,使模型能更好地理解短文本的含义。为了充分获取并融合不同来源的特征信息,引入多头注意力机制对特征进行融合,以提高模型性能。在THUCNews和今日头条两个数据集上的测试结果表明,该模型准确率、召回率和F1值相较于传统模型均有一定提高,证明了其在解决短文本分类问题上的有效性和可行性,为解决短文本分类的实际问题提供了有效工具。Aiming at the problem that the sparseness of short text languages leads to difficulties in semantic analysis,a method combining two-channel feature fusion and adversarial training is proposed for short text classification.First,ChineseBERT is used for word embedding representation to address the challenge of sparse vocabulary in Chinese short text,followed by the introduction of FGM adversarial training technique to enhance the robustness and generalization ability of the overall model.Then,the semantic information is enriched by two-channel DPCNN and BiGRU for feature extraction,so that the model can better understand the meaning of the short text.In order to fully acquire and fuse feature information from different sources,a multi-attention mechanism is introduced to fuse the features as a way to improve the performance of the model.The model proposed in this paper is tested on two dataset,THUCNews and Today’s Headlines,and shows an improvement in accuracy,recall rate and F1 value compared with the traditional model,proving its effectiveness and feasibility in solving the problem of short text categorization,and providing an effective tool for solving the practical problem of short text categorization.

Keywords: ChineseBERT; DPCNN; BiGRU; multi-head attention mechanism; feature fusion; adversarial training

CLC Number: TP391.1 [Automation and Computer Technology / Computer Application Technology]

 
