TCM text relationship extraction model based on bidirectional LSTM and GBDT

Cited by: 12

Authors: Luo Jigen; Du Jianqiang[1]; Nie Bin[1]; Xiong Wangping[1]; Liu Lei[1]; He Jia[1] (School of Computer, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China)

Affiliation: [1] School of Computer, Jiangxi University of Traditional Chinese Medicine

Source: Application Research of Computers (《计算机应用研究》), 2019, No. 12, pp. 3744-3747 (4 pages)

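Funding: National Natural Science Foundation of China (61363042, 61562045, 61762051); Major R&D Program of the Jiangxi Provincial Department of Science and Technology (20171ACE50021); Key R&D Program of the Jiangxi Provincial Department of Science and Technology (20171BBG70108); Jiangxi Provincial Graduate Innovation Special Fund (YC2017-S349)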

Abstract: Using softmax as the classifier of a long short-term memory (LSTM) network can leave an entity relation recognition model with insufficient generalization ability, making it poorly suited to extracting entity relations from traditional Chinese medicine (TCM) text. To address this, the paper proposes a relation recognition algorithm, BILSTM-GBDT, that combines a bidirectional long short-term memory network (BILSTM) with a gradient boosting decision tree (GBDT). The TCM text is first vectorized with word2vec; an attention-based BILSTM then extracts high-order features; finally, GBDT, an ensemble classification model, serves as the feature classifier to improve relation recognition. Experimental results on several relation corpora, including a TCM corpus, show that the model achieves higher precision, recall and F value than the traditional SVM method, the GBDT method and deep learning methods.
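The abstract describes an attention step that turns per-token BILSTM outputs into features for a downstream classifier. The record does not give the paper's attention formula, so the following is only a minimal sketch of one common form (tanh scoring against a query vector, softmax over time steps, weighted sum). The function name `attention_pool`, the `query` vector, and the toy numbers are all assumptions made for illustration, not taken from the paper.

```python
import math


def attention_pool(hidden_states, query):
    """Collapse per-token hidden states into one fixed-length feature vector.

    hidden_states: list of T vectors (each a list of d floats), standing in
                   for the BILSTM outputs of one sentence (assumed input).
    query:         list of d floats, a hypothetical learned attention vector.
    Returns (pooled_vector, attention_weights).
    """
    # Score each time step: dot product of tanh(h_t) with the query vector.
    scores = [
        sum(q * math.tanh(h) for q, h in zip(query, h_t))
        for h_t in hidden_states
    ]
    # Numerically stable softmax over the time steps.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of hidden states is the sentence-level feature vector.
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h_t[i] for w, h_t in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return pooled, weights


# Toy input: 3 time steps with 2-dimensional hidden states (made-up numbers).
states = [[0.1, -0.2], [0.9, 0.4], [0.0, 0.3]]
features, weights = attention_pool(states, query=[1.0, 1.0])
```

In a pipeline like the one the abstract outlines, a vector such as `features` would be passed to a tree-ensemble classifier such as GBDT rather than to a softmax layer; how the paper wires this up in detail is not stated in this record.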

Keywords: relation extraction; long short-term memory network; gradient boosting decision tree; attention mechanism; TCM text

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
