Authors: YANG Qianyi; YU Jinping [1]
Affiliation: [1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China
Source: Software Guide (《软件导刊》), 2024, No. 12, pp. 82-91 (10 pages)
Abstract: Micro, small, and medium-sized enterprises play an important role in the national economy. In recent years, the state has introduced a range of preferential policies for enterprises, and these policies contain key information about government decision-making. However, policy texts are structurally complex, depend heavily on domain-specific semantics, and contain noisy text and nested entities, all of which make information extraction difficult. To address this, a named entity recognition model based on a multi-level lexical global pointer and adversarial training is proposed. The model incorporates LEBERT at the embedding layer to obtain a combined semantic representation of characters and words, and uses a global pointer to build a global entity matrix that handles flat and nested entities in a unified way. It also introduces rotary position encoding to improve sensitivity to positional information, and applies adversarial training to improve stability and robustness. Experimental results show that the model achieves an F1 score of 81.90%, which is 4.72% higher than classical sequence-labeling models, and that its overall performance is sufficient to support downstream tasks.
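Keywords: named entity recognition; preferential policies for enterprises; pre-trained model; global pointer; adversarial training
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]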
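
The abstract's global entity matrix and rotary position encoding can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the per-token query and key vectors have already been produced by some encoder for a single entity type, and the function names, the scaling by the square root of the dimension, and the decoding threshold of 0 are all illustrative assumptions.

```python
import math


def rotary(vec, pos):
    """Rotate consecutive pairs of `vec` by angles that grow with `pos`.

    `vec` must have even length. Position 0 leaves the vector unchanged.
    """
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        angle = pos * 10000.0 ** (-i / dim)
        c, s = math.cos(angle), math.sin(angle)
        out += [vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c]
    return out


def span_scores(queries, keys):
    """Build the entity matrix for one entity type.

    scores[i][j] is the score of the span that starts at token i and ends
    at token j. Spans with end < start are masked with -inf.
    """
    n = len(queries)
    q = [rotary(v, p) for p, v in enumerate(queries)]
    k = [rotary(v, p) for p, v in enumerate(keys)]
    scale = math.sqrt(len(queries[0]))  # assumed scaling factor
    return [
        [
            sum(a * b for a, b in zip(q[i], k[j])) / scale if i <= j else -math.inf
            for j in range(n)
        ]
        for i in range(n)
    ]


def decode(scores, threshold=0.0):
    """Return every (start, end) span whose score exceeds the threshold."""
    return [
        (i, j)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if s > threshold
    ]
```

Because each span is scored independently as a cell of the matrix, a span such as (0, 3) and a span nested inside it such as (1, 2) can both be returned by `decode`, which is how flat and nested entities are handled in one pass. The rotation makes the query-key dot product depend on the distance between the start and end tokens rather than on their absolute positions.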