Authors: Guan Jingwen (关景文); Song Xiao (宋晓); Li Xiaoqing (李晓庆); Yang Tong (杨彤); Zhou Junhua (周军华) (School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China; School of Cyber Science and Technology, Beihang University, Beijing 100191, China; Beijing Simulation Center, Beijing 100854, China)
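Affiliations: [1] School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191 [2] School of Cyber Science and Technology, Beihang University, Beijing 100191 [3] Beijing Simulation Center, Beijing 100854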
Source: Journal of System Simulation (《系统仿真学报》), 2023, No. 8, pp. 1757-1767 (11 pages)
Funding: National Key Research and Development Program of China (2020YFB1712203).
Abstract: Text in general domains is relatively easy to recognize, whereas professional terminology contains many nested named entities that are hard to recognize, which is one of the core challenges in building a knowledge graph for the aerospace field. Existing named entity recognition techniques mostly use a bidirectional long short-term memory network with a conditional random field (BiLSTM-CRF), which struggles to distinguish complex relationships such as nesting and overlap among missile-domain terms. To address this problem, building on nested entity annotation of domain text, a nested named entity recognition method is proposed that fuses linguistic features and is based on machine reading comprehension; it introduces prior knowledge, changes the decoding method, and performs multi-task prediction in question-answering form. Experiments show that the proposed method effectively improves the accuracy and recall of nested entity recognition in missile-domain text, and that its overall F1 score is 13.89% higher than that of the BiLSTM-CRF-based nested named entity recognition method.
Keywords: missile; nested named entity recognition; knowledge extraction; machine reading comprehension; linguistic features
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]