Authors: WANG Wei-hong[1], LYU Hong-yan, CAO Yu-hui[1], HUO Zheng[1]
Affiliation: [1] School of Information Technology, Hebei University of Economics and Business, Shijiazhuang 050061, Hebei, China
Source: Computer Technology and Development (《计算机技术与发展》), 2021, No. 8, pp. 100-105 (6 pages)
Funding: National Natural Science Foundation of China (62002098); Natural Science Foundation of Hebei Province (F2020207001); Scientific Research and Development Program Fund of Hebei University of Economics and Business (2021ZD03).
Abstract: To address insufficient semantic analysis and low accuracy in named entity recognition (NER), a hybrid neural network NER method based on the BERT model is proposed. A survey of current NER research found that inadequate data analysis and feature extraction lead to low accuracy. The BERT pre-trained language model is used to dynamically generate a semantic vector for each character, enriching the text features. A convolutional neural network (CNN) then extracts semantic features a second time, automating semantic extraction, and the BERT and CNN outputs are combined as the input vector for the next stage. A bidirectional long short-term memory network (BiLSTM) with an attention mechanism captures the forward and backward context of each character at the character level. Finally, a conditional random field (CRF) decodes the tag sequence to obtain the globally optimal labeling. Experiments on the People's Daily and MSRA datasets show that, compared with other models, the method captures semantic information effectively and improves accuracy, recall, and F1 score.
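Keywords: named entity recognition; BERT model; convolutional neural network; bidirectional long short-term memory network; conditional random field
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]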
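
The abstract's last stage, CRF decoding to a globally optimal tag sequence, is conventionally done with the Viterbi algorithm. The following is a minimal sketch of that step only, not the authors' implementation: the function name `viterbi_decode`, the three-tag set, and all scores are assumptions made for illustration, and start/end transition scores are omitted for brevity.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM layer)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # Best score of any path ending in each tag at the current position.
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score: List[float] = []
        pointers: List[int] = []
        for j in range(num_tags):
            # Pick the previous tag that maximises the path score into tag j.
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores, chosen only to illustrate the idea.
TAGS = ["O", "B-PER", "I-PER"]
example_emissions = [[1.0, 0.8, 0.0], [0.0, 0.1, 2.0]]
# Penalise the invalid transition O -> I-PER; all other transitions score 0.
example_transitions = [[0.0, 0.0, -100.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print([TAGS[i] for i in viterbi_decode(example_emissions, example_transitions)])
```

In this toy input, picking the best tag per character independently would give `O, I-PER`, which is not a valid BIO sequence; the transition penalty steers the decoder to `B-PER, I-PER` instead. That is the sense in which sequence-level decoding is "globally" rather than locally optimal.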