Word Clustering Based Convolutional Neural Network and Bi-LSTM for Relation Recognition in Chinese Compound Sentence  Cited by: 1


Authors: SUN Kaili; LI Yuan[1]; DENG Dunhua[2]; LI Miao; LI Yang

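Affiliations: [1] School of Computer Science, Central China Normal University, Wuhan 430079; [2] Center for Research on Language and Language Education, Central China Normal University, Wuhan 430079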

Source: Computer & Digital Engineering, 2021, No. 8, pp. 1588-1593 (6 pages)

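Funding: Supported by the National Social Science Fund of China (No. 18BYY174) and the Humanities and Social Sciences Research Planning Fund of the Ministry of Education (No. 14YJA740020).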

Abstract: Compound sentence relation recognition identifies the semantic relations between clauses and is central to the semantic analysis of compound sentences. Non-saturated Chinese compound sentences express their relations implicitly, which makes those relations hard to recognize. To mine the implicit semantic information in a compound sentence and classify its relation correctly, this paper proposes BCCNN, a network that combines a word-clustering-based CNN with a Bi-LSTM. A word clustering algorithm models the word vectors to extract semantic similarity features between words, and on that basis a CNN models the compound sentence in depth to obtain its local features. The pooling layer of the CNN is replaced with a Bi-LSTM layer, which reduces the loss of semantic information caused by pooling and also captures global long-distance semantic dependency features. Compared with other reported results on the Chinese Compound Sentence Corpus (CCCS) and the Tsinghua Chinese Treebank (TCT), the method achieves good recognition performance, with accuracies of 92.4% and 90.7%, respectively.
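The data flow described in the abstract (convolution for local features, then a Bi-LSTM in place of pooling, then relation classification) can be sketched roughly as follows. This is a minimal illustration in plain Python and not the authors' BCCNN: the layer sizes, random weights, bias-free LSTM, number of relation classes, and all function names are hypothetical. The input word vectors are assumed to already come from the word clustering step, which the abstract does not specify in detail.

```python
import math
import random


def conv1d_relu(seq, filters, width):
    """Valid 1-D convolution plus ReLU over a sequence of word vectors.

    seq: list of T vectors of size d; filters: F flat weight lists of size width * d.
    Returns T - width + 1 feature vectors of size F (no pooling is applied).
    """
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_last_state(seq, weights, hidden):
    """Run a bias-free LSTM over seq and return the final hidden state.

    weights: 4 * hidden rows (input, forget, output, candidate gates),
    each row of size len(x) + hidden.
    """
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        xh = x + h  # concatenate current input and previous hidden state
        z = [sum(w * v for w, v in zip(row, xh)) for row in weights]
        i = [sigmoid(v) for v in z[:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:]]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def classify(seq, params, width, hidden):
    """Conv features -> Bi-LSTM (instead of pooling) -> softmax over relation classes."""
    feats = conv1d_relu(seq, params["conv"], width)
    fwd = lstm_last_state(feats, params["lstm_fwd"], hidden)        # left-to-right pass
    bwd = lstm_last_state(feats[::-1], params["lstm_bwd"], hidden)  # right-to-left pass
    joint = fwd + bwd
    logits = [sum(w * v for w, v in zip(row, joint)) for row in params["out"]]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
DIM, WIDTH, FILTERS, HIDDEN, CLASSES = 4, 3, 5, 3, 4  # toy sizes, all hypothetical
params = {
    "conv": rand_matrix(FILTERS, WIDTH * DIM, rng),
    "lstm_fwd": rand_matrix(4 * HIDDEN, FILTERS + HIDDEN, rng),
    "lstm_bwd": rand_matrix(4 * HIDDEN, FILTERS + HIDDEN, rng),
    "out": rand_matrix(CLASSES, 2 * HIDDEN, rng),
}
sentence = rand_matrix(7, DIM, rng)  # 7 stand-in word vectors, not real embeddings
probs = classify(sentence, params, WIDTH, HIDDEN)
print(len(probs), round(sum(probs), 6))
```

Because the convolution output keeps one feature vector per window position, the recurrent layer sees the whole sequence of local features, whereas max pooling would keep only one value per filter. A trained model would learn these weights from labeled compound sentences; here they are random, so the printed probabilities carry no meaning.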

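Keywords: compound sentence relation recognition; non-saturated compound sentences; word clustering algorithm; convolutional neural network (CNN); bidirectional long short-term memory network (Bi-LSTM)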

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
