Relationship-oriented entity relationship extraction method combining dependent information  Cited by: 4


Authors: Wang Jinghui; Lu Ling[1]; Duan Zhili; Zhang Liang; Wang Yuke (College of Computer Science & Engineering, Chongqing University of Technology, Chongqing 400050, China)

Affiliation: [1] College of Computer Science & Engineering, Chongqing University of Technology, Chongqing 400050, China

Source: Application Research of Computers (《计算机应用研究》), 2023, No. 5, pp. 1410-1415, 1440 (7 pages)

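Funding: National Social Science Fund of China, Western Project (2017CG29); Chongqing Education Science Planning Project (2021CJG05); Chongqing University of Technology Graduate Education High-Quality Development Action Plan Project (gzlcx20223201).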

Abstract: Most Chinese entity relation extraction methods process text as character sequences. This leads to insufficient semantic representation of individual characters and to semantic forgetting over long character sequences, which limits recall for entities that are far apart. To address this, the paper proposes a relation-oriented extraction method that incorporates dependency syntax information. The input layer takes character sequences together with word sequences based on synonym representations. The encoder uses a long short-term memory network (LSTM) to encode the text and adds global dependency information, which is used to produce the relation gate representation. The decoder adds dependency type information and, under the control of the relation gate, uses a bidirectional LSTM (BiLSTM) to decode entity-relation triples. On the SanWen, FinRE, DuIE, and IPRE Chinese datasets, the F1 values of the method are 5.84%, 2.11%, 2.69%, and 0.39% higher than those of the baseline methods, respectively. Ablation experiments show that the proposed representations of global dependency information and dependency type information both improve extraction performance, and that performance on long sentences and distant entities is also consistently better than the baselines.
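For readers unfamiliar with the F1 metric cited in the abstract, the following is a minimal sketch of how precision, recall, and F1 are commonly computed over extracted triples. It assumes exact-match scoring on (head entity, relation, tail entity); this record does not say which scoring protocol the paper uses. The function name `triple_prf` and the sample triples are hypothetical and serve only as illustration.

```python
from typing import Iterable, Tuple

# A triple is (head entity, relation, tail entity); exact match is assumed.
Triple = Tuple[str, str, str]


def triple_prf(gold: Iterable[Triple], pred: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted triples against gold triples."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example data: one of the two predictions matches a gold triple.
gold = [("Alice", "works_for", "Acme"), ("Acme", "located_in", "Chongqing")]
pred = [("Alice", "works_for", "Acme"), ("Alice", "located_in", "Chongqing")]
precision, recall, f1 = triple_prf(gold, pred)
```

Under this scoring a prediction counts only if all three elements match, so a correct relation attached to the wrong entity contributes nothing to recall. That is one reason recall on distant entity pairs, which the abstract highlights, directly affects F1.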

Keywords: entity relation extraction; dependency parsing; pruning; relation-oriented; synonyms

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
