Authors: YANG Chunming [1,3]; WEI Chenzhi; ZHANG Hui; ZHAO Xujian; LI Bo [1]
Affiliations: [1] School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China; [2] School of Science, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China; [3] Sichuan Big Data and Intelligent System Engineering Technology Research Center, Mianyang 621010, Sichuan, China
Source: Journal of Southwest University of Science and Technology (《西南科技大学学报》), 2020, No. 3, pp. 86-91 (6 pages)
Funding: Ministry of Education Humanities and Social Sciences Fund (17YJCZH260); CERNET Next Generation Innovation Project (NGII20170901, NGII20180403).
Abstract: Named entities in the government affairs domain are usually the names of government service items. Compared with open-domain entities, they tend to be long, to appear in juxtaposed (coordinated) form, and to have abbreviations or aliases, and no publicly available training dataset has been reported so far. This paper constructs a named entity recognition dataset for the government affairs domain containing 25,176 sentences and proposes a neural network recognition model based on BERT-BLSTM-CRF. Without relying on manual feature selection, the model uses a Chinese pre-trained BERT model and then applies BLSTM-CRF to recognize entities. Experimental results show that the model outperforms CRF, BLSTM-CRF, and CNN-BLSTM-CRF, with an F1 score of 92.23%.
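Keywords: government affairs; named entity recognition; BLSTM; CRF; BERT
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]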
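The abstract reports its result as an F1 score over recognized entities. As a reading aid, the sketch below shows one common way such a score is computed for sequence labeling: entity-level exact-match F1 over BIO tag sequences. This is an assumption for illustration only; the record does not state the paper's tagging scheme or matching criterion, and the label name `ITEM` and the functions `extract_spans` and `entity_f1` are hypothetical.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over all sentences."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the second predicted entity is cut short,
# so it does not count as a match under the exact-match criterion.
gold = [["B-ITEM", "I-ITEM", "I-ITEM", "O", "B-ITEM", "I-ITEM"]]
pred = [["B-ITEM", "I-ITEM", "I-ITEM", "O", "B-ITEM", "O"]]
precision, recall, f1 = entity_f1(gold, pred)
```

Under exact matching, a long entity that is predicted with a wrong boundary counts as both a missed entity and a spurious one, which is one reason long, juxtaposed entity names of the kind the abstract describes are harder to score well on.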