Research on Agricultural Entity Naming Recognition Based on Multiple Feature Words Embedding

Cited by: 1


Authors: Ding Hao; Kong Lingyuan [1,2]; Liu Qing; Hu Guangwei (School of Information Management, Nanjing University, Nanjing 210023, China; Institute of Government Data Resources, Nanjing University, Nanjing 210023, China)

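Affiliations: [1] School of Information Management, Nanjing University, Nanjing 210023, Jiangsu, China; [2] Institute of Government Data Resources, Nanjing University, Nanjing 210023, Jiangsu, China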

Source: Journal of Modern Information (《现代情报》), 2023, No. 11, pp. 135-145 (11 pages)

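Funding: Major Project of the National Social Science Fund of China, "Research on the Precise Construction of a Big-Data-Driven Urban and Rural Community Service System" (Grant No. 20&ZD154); Jiangsu Province Research and Practice Innovation Program project, "Research on Topic Mining of Time-Series Reviews and Knowledge Graph Construction for Online Health Information" (Grant No. KYCX23_0079).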

Abstract: [Purpose/Significance] This paper proposes an agricultural named entity recognition method based on a word embedding model that fuses multiple features, with the aim of improving recognition accuracy in the agricultural domain. [Method/Process] The embedding layer combines multiple feature vectors, including character, positional-semantic, and domain knowledge dictionary features, so that both the positional information and the contextual semantics of each character are taken into account. This improves on single character-vector embedding in line with the characteristics of Chinese agricultural entities and captures more agricultural entity features. A bidirectional long short-term memory network (BiLSTM) and a multi-head attention mechanism then learn the long-distance dependencies in the text, and a conditional random field (CRF) produces the globally optimal label sequence. [Result/Conclusion] In comparative experiments against nine baseline methods on a Chinese agricultural entity corpus, the model achieves a Precision of 92.2%, a Recall of 92.0%, and an F1 score of 92.11%, outperforming all of the baseline models, which indicates that the proposed model recognizes Chinese agricultural named entities more accurately.
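The abstract's final step, in which a CRF selects the globally optimal label sequence, can be illustrated with Viterbi decoding, the standard inference procedure for a linear-chain CRF. The sketch below is illustrative only and is not the authors' implementation: the function name `viterbi_decode`, the B/I/O tag set, and all scores are assumptions made for the example. In a model of the kind described, the emission scores would come from the BiLSTM and attention layers and the transition scores would be learned during training.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the tag sequence with the highest total score.

    emissions[t][tag]     : score of `tag` at position t (hypothetical values here)
    transitions[a][b]     : score of moving from tag a to tag b
    """
    if not emissions:
        return []

    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    # back[t - 1][tag] = previous tag on that best path
    back: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example (all numbers are made up): the three characters of "种玉米".
tags = ["B", "I", "O"]
transitions = {a: {b: 0.0 for b in tags} for a in tags}
transitions["O"]["I"] = -100.0  # an entity cannot continue straight after "O"

emissions = [
    {"B": 0.1, "I": 0.0, "O": 1.0},  # 种
    {"B": 0.9, "I": 1.0, "O": 0.2},  # 玉 (locally, "I" scores highest)
    {"B": 0.1, "I": 1.2, "O": 0.3},  # 米
]

decoded = viterbi_decode(emissions, transitions, tags)
print(decoded)  # ['O', 'B', 'I']
```

Choosing the best tag independently at each position would give O, I, I, which starts an entity with "I". Scoring whole sequences lets the transition penalty override the locally preferred tag for the second character, which is the sense in which the CRF output is globally rather than locally optimal.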

Keywords: natural language processing; named entity recognition; agricultural text; information extraction; BiLSTM; CRF

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
