Sentiment Analysis of Chinese Short Text Combining Context and Dependent Syntactic Information

Cited by: 8


Authors: DU Qiming, LI Nan [1,2], LIU Wenfu, YANG Shudan, YUE Feng [1,2]

Affiliations: [1] School of Cyberspace Security Academy, Information Engineering University, Zhengzhou 450000, China; [2] State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450000, China; [3] State Key Laboratory of Complex Electromagnetic Environment Effect on Electronic and Information System, Luoyang, Henan 471003, China

Source: Computer Science (《计算机科学》), 2023, No. 3, pp. 307-314 (8 pages)

Funding: National Natural Science Foundation of China (61802433).

Abstract: Dependency parsing analyzes the syntactic structure of a sentence from a linguistic perspective. Existing studies show that combining this graph-like data with a graph convolutional network (GCN) helps a model better understand text semantics. However, when converting dependency syntactic information into an adjacency matrix, these methods ignore the types of the syntactic dependency labels and the word semantics associated with those labels, so the model fails to capture deep sentiment features in the text. To address these problems, this paper proposes CDSI (Context and Dependency Syntactic Information), a sentiment analysis model for Chinese short text that combines context and dependency syntactic information. The model uses a bidirectional long short-term memory (BiLSTM) network to extract the contextual semantics of the text, and introduces a dependency-aware embedding representation method that mines, over the syntactic structure, the contribution weights of different dependency paths to the sentiment classification task. A GCN then models the context and the dependency syntactic information jointly to strengthen the sentiment features in the text representation. Experiments on the SWB, NLPCC2014 and SMP2020-EWEC datasets show that CDSI effectively integrates the semantic and syntactic-structure information in sentences and achieves good results on both binary and multi-class sentiment classification of Chinese short text.
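The gap the abstract points to, adjacency matrices that ignore dependency label types, can be illustrated with a small sketch. It is not the CDSI implementation: the function names, the label names (`nsubj`, `advmod`), the label weights, and the toy feature vectors are all assumptions made for illustration. In the paper the contribution weights are learned and the token features come from a BiLSTM; here they are fixed by hand, and the graph convolution is a plain row-normalized variant.

```python
# Illustrative sketch only; NOT the CDSI model from the paper.
# All names, labels, and numeric values below are hypothetical.

def label_aware_adjacency(n_tokens, arcs, label_weight):
    """Build a symmetric adjacency matrix whose entries depend on the dependency label.

    arcs: list of (head_index, dependent_index, label) triples.
    label_weight: dict mapping label -> non-negative weight
                  (hand-set here; a trained model would learn these).
    """
    adj = [[0.0] * n_tokens for _ in range(n_tokens)]
    for i in range(n_tokens):
        adj[i][i] = 1.0  # self-loop so each token keeps its own features
    for head, dep, label in arcs:
        w = label_weight.get(label, 1.0)  # unknown labels fall back to a plain edge
        adj[head][dep] = w
        adj[dep][head] = w
    return adj


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1 A H W), with row normalization."""
    n = len(adj)
    d_in, d_out = len(weight), len(weight[0])
    out = []
    for i in range(n):
        degree = sum(adj[i])  # >= 1 because of the self-loop
        # Aggregate neighbour features, weighted by the adjacency entries.
        agg = [sum(adj[i][j] * features[j][k] for j in range(n)) / degree
               for k in range(d_in)]
        # Linear projection followed by ReLU.
        out.append([max(0.0, sum(agg[k] * weight[k][c] for k in range(d_in)))
                    for c in range(d_out)])
    return out


# Toy sentence: "服务 非常 好" ("service", "very", "good"), with "好" as the root.
arcs = [(2, 0, "nsubj"), (2, 1, "advmod")]
label_weight = {"nsubj": 1.0, "advmod": 0.5}  # hypothetical contribution weights
adj = label_aware_adjacency(3, arcs, label_weight)

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in for BiLSTM outputs
identity = [[1.0, 0.0], [0.0, 1.0]]              # stand-in for a learned projection
hidden = gcn_layer(adj, features, identity)
```

With a 0/1 adjacency matrix both arcs would contribute equally; here the `advmod` edge carries half the weight of the `nsubj` edge, so the aggregation for each token reflects the label type as well as the tree structure.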

Keywords: syntactic structure; context information; GCN; Chinese short text

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
