Research on Chinese news text sentiment classification method based on feature fusion  Cited by: 3


Authors: FENG Yuhang; SHAO Jianfei [1]; ZHANG Xiaowei; SHAO Jianlong [1] (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

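Affiliation: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China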

Source: Modern Electronics Technique (《现代电子技术》), 2023, No. 3, pp. 62-68 (7 pages)

Funding: National Natural Science Foundation of China (61732005).

Abstract: Existing news text sentiment analysis relies on a single model, which extracts text features one-sidedly and cannot fully capture features such as the semantics of news text. To address this, the paper proposes a BERT-CNN sentiment analysis method based on gated-unit feature fusion. The method extracts feature vectors of the news text separately with the BERT pre-trained language model and a convolutional neural network (CNN), fuses the extracted features with a gated recurrent unit, and feeds the fused result into a Softmax layer for news text classification. The experimental results of BERT, BERT-CNN, BERT-DPCNN and BERT-ERNIE are then compared on three metrics: precision, recall and F1-score. The results show that when the classification scenario is changed to emotion recognition, BERT-CNN still has strong semantic capture ability, which demonstrates its generalization ability. In addition, relative to the original BERT, the BERT-CNN method based on gated-unit feature fusion achieves a larger improvement (2.07%) than the word-vector method (0.31%), which also confirms the effectiveness of the proposed method.
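The abstract does not give the fusion equations, so the following is only a minimal sketch of the general idea of gating two feature vectors before a Softmax classifier. It assumes a simplified element-wise sigmoid gate rather than the full gated recurrent unit the paper names, and the feature vectors, dimensions, random weights, and function names (`gated_fusion`, `linear`, `random_matrix`) are all hypothetical stand-ins, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    """Logistic function, used here as the gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def gated_fusion(h_bert, h_cnn, w_gate, b_gate):
    """Assumed simplified gate (not a full GRU):
    g = sigmoid(W [h_bert; h_cnn] + b), fused = g*h_bert + (1-g)*h_cnn."""
    gate_input = h_bert + h_cnn  # list concatenation of the two features
    gate = [sigmoid(v) for v in linear(w_gate, b_gate, gate_input)]
    return [g * hb + (1.0 - g) * hc
            for g, hb, hc in zip(gate, h_bert, h_cnn)]


def random_matrix(rows, cols, rng):
    """Hypothetical untrained weights, for illustration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, num_classes = 4, 2

# Stand-ins for the sentence features produced by the BERT and CNN branches.
h_bert = [0.2, -0.1, 0.7, 0.4]
h_cnn = [0.5, 0.3, -0.2, 0.1]

w_gate = random_matrix(dim, 2 * dim, rng)
b_gate = [0.0] * dim
w_out = random_matrix(num_classes, dim, rng)
b_out = [0.0] * num_classes

fused = gated_fusion(h_bert, h_cnn, w_gate, b_gate)
probs = softmax(linear(w_out, b_out, fused))  # class probabilities
print(fused, probs)
```

Because the gate lies strictly between 0 and 1, each fused component is a convex combination of the corresponding BERT and CNN components. In a trained model the gate would learn, per dimension, which branch to trust more.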

Keywords: sentiment analysis; text feature extraction; feature fusion; text classification; emotion recognition; semantic capture

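Classification number: TN911-34 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Electronics and Telecommunications: Information and Communication Engineering]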

 
