Research on Nested Named Entity Recognition Based on Policy and Regulatory Data


Authors: XU Han (徐晗); LIANG Zhao (梁曌); LIANG Xiaolin (梁小林) (School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha 410114, China)

Affiliation: [1] School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha, Hunan 410114

Source: Journal of Hunan University of Arts and Science (Science and Technology), 2024, No. 3, pp. 19-23, 29 (6 pages)

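Funding: Key Program of the National Natural Science Foundation of China (51839002); Natural Science Foundation of Hunan Province (2021JJ30734); Hunan Provincial Postgraduate Innovation Project (CX20220952).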

Abstract: This paper analyzes the nested-entity problem that frequently arises in policy and regulatory text and builds a pointer network model that incorporates a biaffine transformation. The model replaces the traditional conditional random field (CRF) with a pointer network to reduce computational complexity, uses a biaffine transformation module to handle nested entities, and defines a new loss function to address the sparsity of named entities. Experiments show that the model mitigates the overfitting and weak real-world prediction performance that traditional combined models exhibit on policy and regulatory data. On a self-built policy and regulatory dataset it achieves an F1 score of 78.41%, a clear improvement over traditional methods.
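To make the biaffine idea concrete, the following is a minimal sketch of span scoring with a biaffine transformation, not the authors' implementation. It assumes the common formulation in which each candidate span (i, j) receives a per-label score s = h_i^T U h_j + w · [h_i; h_j] + b, where h_i and h_j are start and end token representations. The function names, the one-hot toy vectors, the hand-set weights, the 0.0 threshold, and the four-token example phrase are all illustrative assumptions; in a trained model the representations would come from the encoder and the weights would be learned.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def biaffine_score(start: Vector, end: Vector, U: Matrix, w: Vector, b: float) -> float:
    """Score one (start, end) pair: start^T U end + w . [start; end] + b."""
    bilinear = sum(
        start[p] * U[p][q] * end[q]
        for p in range(len(start))
        for q in range(len(end))
    )
    linear = sum(wi * xi for wi, xi in zip(w, start + end))
    return bilinear + linear + b


def decode_spans(
    start_reps: List[Vector],
    end_reps: List[Vector],
    params: Dict[str, Tuple[Matrix, Vector, float]],
    threshold: float = 0.0,
) -> List[Tuple[int, int, str]]:
    """Keep every span (i <= j) whose score exceeds the threshold.

    Each span is scored independently, so overlapping and nested
    spans can all be returned.
    """
    spans = []
    n = len(start_reps)
    for label, (U, w, b) in params.items():
        for i in range(n):
            for j in range(i, n):
                if biaffine_score(start_reps[i], end_reps[j], U, w, b) > threshold:
                    spans.append((i, j, label))
    return sorted(spans)


# Hypothetical toy input: 4 tokens with one-hot representations, so that
# U[i][j] acts directly as the score for span (i, j).
tokens = ["Hunan", "Province", "Education", "Department"]
n = len(tokens)
reps = [[1.0 if k == i else 0.0 for k in range(n)] for i in range(n)]


def toy_params(i: int, j: int) -> Tuple[Matrix, Vector, float]:
    """Hand-set weights (assumption) that favour exactly one span (i, j)."""
    U = [[0.0] * n for _ in range(n)]
    U[i][j] = 1.0
    return U, [0.0] * (2 * n), -0.5


params = {"ORG": toy_params(0, 3), "LOC": toy_params(0, 1)}
spans = decode_spans(reps, reps, params)
for i, j, label in spans:
    print(label, " ".join(tokens[i : j + 1]))
```

Because every (start, end) pair is scored on its own, the inner span "Hunan Province" and the outer span covering all four tokens are both recovered. A token-level tagger that assigns a single label per token cannot represent both at once, which is why span-based scoring suits nested entities.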

Keywords: named entity recognition; policy text mining; nested entities; natural language processing; Bert-BiLSTM-Biaffine-Span model

Classification codes: O213 [Science: Probability Theory and Mathematical Statistics]; TP181 [Science: Mathematics]

 
