Chinese Named Entity Recognition Based on Denoising Joint Character-Word Model (基于去噪字词联合模型的中文命名实体识别)  Cited by: 5


Authors: YANG Qian (杨倩); GU Lei (顾磊) [1]

Affiliation: [1] School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

Source: Computer Engineering and Applications (《计算机工程与应用》), 2021, No. 7, pp. 151-157 (7 pages)

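Funding: Youth Fund for Humanities and Social Sciences Research of the Ministry of Education (18YJC870006); National Natural Science Foundation of China (61302157).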

Abstract: Chinese named entity recognition (NER) is a basic task in Chinese information processing and provides technical support for relation extraction, entity linking, and knowledge graphs. Compared with traditional NER methods, models based on bidirectional long short-term memory (BiLSTM) neural networks have achieved good results on Chinese NER. To address the inaccurate feature extraction of the joint character-word BiLSTM-CRF model, a Gated denoising mechanism is introduced on top of that model. The mechanism fine-tunes the input character vectors, automatically learning to filter out or down-weight unimportant character information in the text while retaining the information that is more useful for NER, thereby improving the recognition rate of named entities. Test results on the Resume and Weibo datasets show that the method effectively improves Chinese NER performance.
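The abstract does not give the formula for the Gated denoising mechanism. As a rough illustration of the general idea only, the sketch below applies a sigmoid gate to each character vector before it would be passed to a BiLSTM-CRF encoder. The gate form `g = sigmoid(W x + b)`, the element-wise product, and the names `gate_character_vector` and `denoise_sentence` are assumptions made for this example, not the authors' published formulation.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gate_character_vector(x, weights, bias):
    """Scale each dimension of a character vector by a gate in (0, 1).

    Assumed form (hypothetical, not taken from the paper):
        g = sigmoid(W x + b),  output = g * x  (element-wise)
    A gate value near 0 suppresses a dimension; a value near 1 keeps it.
    """
    gated = []
    for row, b, x_i in zip(weights, bias, x):
        pre_activation = sum(w * v for w, v in zip(row, x)) + b
        gated.append(sigmoid(pre_activation) * x_i)
    return gated


def denoise_sentence(char_vectors, weights, bias):
    """Apply the same gate parameters to every character vector in a sentence."""
    return [gate_character_vector(x, weights, bias) for x in char_vectors]


if __name__ == "__main__":
    # Toy 2-dimensional character vectors for a 3-character sentence.
    sentence = [[1.0, -2.0], [0.5, 0.5], [3.0, 0.0]]
    W = [[0.0, 0.0], [0.0, 0.0]]  # untrained gate: every gate value is 0.5
    b = [0.0, 0.0]
    for vector in denoise_sentence(sentence, W, b):
        print(vector)
```

In a trained model, `W` and `b` would be learned jointly with the rest of the network, so the gate could learn which character information to suppress; here they are fixed by hand to keep the example self-contained.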

Keywords: joint character-word model; denoising mechanism; long short-term memory network; Chinese named entity recognition

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
