Authors: Zhang Wenhan (张文韩), Liu Xiaoming (刘小明) [1,4], Yang Guan (杨关), Liu Jie (刘杰) [3,4]
Affiliations: [1] School of Computer Science, Zhongyuan University of Technology, Zhengzhou 450007 [2] Henan Provincial Key Laboratory on Public Opinion Intelligent Analysis (Zhongyuan University of Technology), Zhengzhou 450007 [3] School of Information Science, North China University of Technology, Beijing 100144 [4] China Language Intelligence Research Center of the National Language Commission (Capital Normal University), Beijing 102206
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2023, No. 12, pp. 2864-2876 (13 pages)
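Funding: National Key Research and Development Program of China (2020AAA0109700); National Natural Science Foundation of China (62076167); Key Scientific Research Project of Higher Education Institutions of Henan Province (23A520022).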
Abstract: Cross-domain named entity recognition aims to alleviate the shortage of annotated data in the target domain. Existing methods typically transfer entity recognition capability across domains by sharing feature representations or model parameters, but they make only partial use of the structured knowledge contained in text sequences. To address this, we propose a multi-level structured semantic knowledge enhanced cross-domain named entity recognition (MSKE-CDNER) model, which facilitates cross-domain transfer of entity recognition capability by aligning, at multiple levels, the structured representations contained in source-domain and target-domain text. First, MSKE-CDNER uses a structural feature representation layer to obtain structured semantic knowledge representations of text from the different domains. Then, a latent alignment module aligns these representations at the corresponding levels to obtain structured cross-domain invariant knowledge, which improves the model's use of the structured knowledge in text. In addition, the domain-invariant knowledge is fused with domain-specific knowledge to further strengthen the model's generalization ability. Finally, experiments are conducted on five English datasets and on a dedicated cross-domain named entity recognition dataset. Compared with current cross-domain models, MSKE-CDNER improves average performance by 0.43% and 1.47%, respectively, indicating that exploiting the structured knowledge in feature representations can effectively improve entity recognition in the target domain.
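Keywords: cross-domain named entity recognition; cross-domain transfer; structured alignment; structured knowledge; domain-invariant knowledge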
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]