Authors: CHEN Yu-jia (陈钰佳); ZHENG Geng-sheng (郑更生) [1,2]; XIAO Wei (肖伟)
Affiliations: [1] School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China; [2] Key Laboratory of Intelligent Robot in Hubei Province, Wuhan Institute of Technology, Wuhan 430205, China
Source: Science Technology and Engineering (《科学技术与工程》), 2023, Issue 18, pp. 7844-7851 (8 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (62106179).
Abstract: Fine-grained sentiment analysis is one of the key tasks in natural language processing. Mainstream approaches to sentiment analysis of Chinese film reviews generally use pre-trained models such as Word2Vec to generate static word vectors, which handle polysemy poorly. Extracting text features through CNN pooling can also lose text information and lead to insufficient learning, and these approaches fail to exploit the long-distance dependencies in the text and the syntactic information in the sentence. To address these problems, a new sentiment analysis model, RoBERTa-PWCN-GTRU, is proposed. The model uses the RoBERTa pre-trained model to generate dynamic word vectors, which addresses polysemy. To extract and use the text information more fully, an improved network, DenseDPCNN, captures long-distance dependencies in the text, and its output is fused in a dual-channel manner with the global semantic information obtained by a Bi-LSTM. The fused features are then combined with the sentence-level syntactic information obtained by a proximity-weighted convolutional network (PWCN), and a gated Tanh-ReLU unit (GTRU) is introduced for further feature selection. Experiments on a purpose-built Chinese film review dataset show that the proposed model clearly outperforms mainstream models, reaching an accuracy of 89.67% and an F1 score of 82.51%. Ablation experiments further verify the effectiveness of the model. The model can provide useful information for producers' future film production and for consumers' ticket-purchasing decisions, and therefore has practical value.
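The abstract names the gated Tanh-ReLU unit (GTRU) but does not give its equations. The sketch below illustrates one commonly used formulation, in which a tanh-activated convolution branch is multiplied elementwise by a ReLU-activated branch that also receives an aspect-dependent offset. It is an assumption for illustration only and may differ from the paper's exact layer; the function names `conv1d` and `gtru` and the scalar `aspect_bias` parameter are hypothetical, and a single output channel is used to keep the example small.

```python
import math


def conv1d(seq, kernel, bias=0.0):
    """Valid 1-D convolution with a single output channel.

    seq:    list of T feature vectors, each of length d
    kernel: list of k weight vectors, each of length d
    Returns a list of T - k + 1 scalars.
    """
    k = len(kernel)
    out = []
    for t in range(len(seq) - k + 1):
        total = bias
        for j in range(k):
            total += sum(x * w for x, w in zip(seq[t + j], kernel[j]))
        out.append(total)
    return out


def gtru(seq, kernel_s, kernel_a, aspect_bias=0.0):
    """Gated Tanh-ReLU unit for one channel (illustrative assumption).

    The tanh branch carries sentiment features; the ReLU branch acts as
    a gate that is shifted by aspect information (here a plain scalar).
    """
    s = [math.tanh(v) for v in conv1d(seq, kernel_s)]
    a = [max(0.0, v) for v in conv1d(seq, kernel_a, bias=aspect_bias)]
    return [si * ai for si, ai in zip(s, a)]


# Toy input: three token vectors of dimension 2, kernel width 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernel_s = [[1.0, 0.0], [0.0, 1.0]]
kernel_a = [[1.0, 1.0], [-1.0, -1.0]]
gated = gtru(tokens, kernel_s, kernel_a, aspect_bias=1.5)
```

With `aspect_bias=0.0` the ReLU gate in this toy example is zero at every position and the output is suppressed entirely, which shows how the gate can filter features that are irrelevant to the aspect.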
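Keywords: Chinese film reviews; sentiment analysis; RoBERTa pre-trained model; proximity-weighted convolution; gated Tanh-ReLU unit
Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]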