Authors: YANG Jun-hui (杨竣辉) [1]; LIU Bao-bing (刘保冰)
Affiliation: [1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2024, No. 12, pp. 3712-3718 (7 pages)
Funding: National Natural Science Foundation of China (61273328).
Abstract: Existing methods for Chinese named entity recognition capture word-level feature information poorly, and their models are unstable because they are easily affected by noise. To address these problems, a Chinese named entity recognition method based on vocabulary enhancement and adversarial training is proposed. The input text is passed through a vocabulary enhancement module to obtain word vectors, and the character-level embedding vectors produced by a pre-trained model are fused with these word vectors to form character-word fused embeddings. The fused embeddings are then used to generate adversarial samples with the MOA method. A BiGRU encodes the semantic information, and a CRF decodes it to produce the predicted labels. Experimental results show that the method achieves F1 scores of 97.14% on the Chinese named entity recognition dataset Resume and 73.65% on the traditional Chinese medicine instructions dataset, verifying the effectiveness of the model.
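Keywords: Chinese named entity recognition; vocabulary enhancement; pre-trained model; character-word fusion; adversarial training; bidirectional gated recurrent unit (BiGRU); conditional random field (CRF)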
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]