Chinese nested named entity recognition based on vocabulary fusion and span detection  Cited by: 2


Authors: Chen Shuzhen, Dou Quansheng [1,2,3], Tang Huanling, Jiang Ping [1]

Affiliations: [1] School of Computer Science & Technology, Shandong Technology & Business University, Yantai, Shandong 264000, China; [2] Shandong Future Intelligent Computing Collaborative Innovation Center, Yantai, Shandong 264000, China; [3] Key Laboratory of Intelligent Information Processing in Universities of Shandong, Yantai, Shandong 264000, China

Source: Application Research of Computers (《计算机应用研究》), 2023, No. 8, pp. 2382-2386, 2392 (6 pages)

Abstract: Current Chinese named entity recognition models make errors on entities with nested structure and cannot identify them accurately. Span-based methods can find nested entities, but during recognition they often generate spans that contain no entity and cannot clearly delimit span boundaries, which increases the burden on the model. To address this problem, this paper proposes a Chinese nested named entity recognition model based on vocabulary fusion and span boundary detection. The model uses a multi-word fusion method to enhance text features: a purpose-built injection module merges the information of the multiple words associated with each character of the target sentence and then integrates it into BERT, which yields more comprehensive context information and better span representations. In addition, a span boundary detection module delimits span boundaries by using a perceptual classifier to predict the first and last characters of each span. Experiments on public datasets show that the model effectively improves recognition accuracy.

Keywords: Chinese nested named entity recognition; BERT model; multi-word fusion; span boundary detection

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
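The abstract's point about span boundaries can be illustrated with a small sketch. It is not the authors' implementation: the function names, the 0.5 threshold, and the hand-written boundary probabilities are all assumptions made for illustration, standing in for the output of a trained boundary classifier. It compares exhaustive span enumeration with enumeration restricted to predicted first and last characters.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end), both inclusive character indices


def enumerate_spans(n: int, max_len: int) -> List[Span]:
    """Exhaustive baseline: every span of up to max_len characters."""
    return [(i, j) for i in range(n) for j in range(i, min(n, i + max_len))]


def boundary_filtered_spans(
    start_prob: Sequence[float],
    end_prob: Sequence[float],
    max_len: int,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Span]:
    """Keep only spans whose first and last characters both look like boundaries."""
    n = len(start_prob)
    starts = [i for i, p in enumerate(start_prob) if p >= threshold]
    ends = {j for j, p in enumerate(end_prob) if p >= threshold}
    return [
        (i, j)
        for i in starts
        for j in range(i, min(n, i + max_len))
        if j in ends
    ]


# Example sentence with nested entities (an institution inside a longer one).
text = "山东工商学院计算机科学与技术学院"

# Hand-written probabilities (assumed): likely entity starts at indices 0 and 6,
# likely entity ends at indices 1, 5 and 15.
start_prob = [0.1] * len(text)
end_prob = [0.1] * len(text)
for i in (0, 6):
    start_prob[i] = 0.9
for j in (1, 5, 15):
    end_prob[j] = 0.9

all_spans = enumerate_spans(len(text), max_len=len(text))
kept_spans = boundary_filtered_spans(start_prob, end_prob, max_len=len(text))

print(len(all_spans), "candidate spans without boundary detection")
for start, end in kept_spans:
    print((start, end), text[start : end + 1])
```

With these assumed probabilities, the 16-character sentence yields 136 exhaustive candidates but only four boundary-filtered ones, and the nested spans (0, 1), (0, 5) and (0, 15) are all retained for later classification.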

 
