Improved Global Pointer Based Named Entity Recognition Method for Enterprise-benefiting Policies


Authors: YANG Qianyi (杨虔懿), YU Jinping (喻金平) [1]

Affiliation: [1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China

Source: Software Guide (《软件导刊》), 2024, No. 12, pp. 82-91 (10 pages)

Abstract: Micro, small, and medium-sized enterprises play an important role in the national economy. In recent years, the government has introduced a range of enterprise-benefiting policies, and these policies contain key information about government decision-making. However, policy texts are structurally complex, depend heavily on domain-specific semantics, and contain noisy text and nested entities, which makes information extraction difficult. To address this, a named entity recognition model based on a multi-level lexicon global pointer and adversarial training is proposed. The model incorporates LEBERT at the embedding layer to obtain a combined semantic representation of characters and words, and uses a global pointer to build a global entity matrix that handles flat and nested entities in a unified way. It also introduces rotary position encoding to improve sensitivity to position information, and applies adversarial training to improve stability and robustness. Experimental results show that the model achieves an F1 score of 81.90%, which is 4.72% higher than classical sequence-labeling-based models, and that its overall performance is sufficient to support downstream tasks.
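As a rough illustration of how a global entity matrix can cover flat and nested entities in one pass, the sketch below decodes spans from per-type score matrices. It is not the authors' implementation: the function names, the `REGION` and `TARGET` labels, the example phrase, and the scores are all hypothetical, and the scores are filled in by hand rather than produced by a trained model.

```python
from typing import Dict, List, Tuple

# (entity type, start index, end index); the end index is inclusive.
Span = Tuple[str, int, int]


def decode_spans(scores: Dict[str, List[List[float]]],
                 threshold: float = 0.0) -> List[Span]:
    """Return every (type, start, end) whose score exceeds the threshold.

    Each entity type has its own n x n matrix in which entry [start][end]
    scores the span from start to end. Because every span is scored
    independently, overlapping (nested) spans can be returned together.
    """
    spans = []
    for label, matrix in scores.items():
        n = len(matrix)
        for start in range(n):
            # Only the upper triangle is valid: a span cannot end before it starts.
            for end in range(start, n):
                if matrix[start][end] > threshold:
                    spans.append((label, start, end))
    return sorted(spans, key=lambda s: (s[1], s[2], s[0]))


def empty_matrix(n: int, fill: float = -10.0) -> List[List[float]]:
    """Build an n x n matrix of strongly negative (non-entity) scores."""
    return [[fill] * n for _ in range(n)]


# Hypothetical example phrase, tokenized character by character.
tokens = list("江西省科技型中小企业")

# Hand-written scores standing in for model output (assumption).
scores = {"REGION": empty_matrix(len(tokens)),
          "TARGET": empty_matrix(len(tokens))}
scores["REGION"][0][2] = 3.2   # 江西省
scores["TARGET"][3][9] = 2.1   # 科技型中小企业 (outer entity)
scores["TARGET"][6][9] = 1.4   # 中小企业 (nested inside the outer entity)

entities = [(label, "".join(tokens[start:end + 1]))
            for label, start, end in decode_spans(scores)]
print(entities)
```

A sequence-labeling decoder assigns one tag per character, so it cannot return both the outer and the nested `TARGET` span here; scoring each (start, end) pair separately avoids that restriction at the cost of a quadratic number of candidates per entity type.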

Keywords: named entity recognition; enterprise-benefiting policies; pre-trained model; global pointer; adversarial training

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
