Authors: Ding Hao (丁浩); Kong Lingyuan (孔令圆)[1,2]; Liu Qing (刘清); Hu Guangwei (胡广伟) (School of Information Management, Nanjing University, Nanjing 210023, China; Institute of Government Data Resources, Nanjing University, Nanjing 210023, China)
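Affiliations: [1] School of Information Management, Nanjing University, Nanjing, Jiangsu 210023, China; [2] Institute of Government Data Resources, Nanjing University, Nanjing, Jiangsu 210023, China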
Source: Journal of Modern Information (《现代情报》), 2023, No. 11, pp. 135-145 (11 pages)
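Funding: Major Project of the National Social Science Fund of China, "Research on the Precise Construction of an Urban and Rural Community Service System Driven by Big Data" (Grant No. 20&ZD154); Jiangsu Province Research and Practice Innovation Program project, "Research on Topic Mining and Knowledge Graph Construction for Time-Series Comments on Online Health Information" (Grant No. KYCX23_0079).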
Abstract: [Purpose/Significance] This paper proposes a named entity recognition method for the agricultural domain, based on a word embedding model that fuses multiple features, with the aim of improving recognition accuracy. [Method/Process] The embedding layer combines several feature vectors: character features, positional semantic features, and domain knowledge dictionary features. It takes the positional and contextual semantic information of each character into account and, following the characteristics of Chinese entities in the agricultural domain, improves on single character-vector embedding so that more agricultural entity features are captured. A bidirectional long short-term memory network (BiLSTM) and a multi-head attention mechanism then learn the long-distance dependencies in the text, and a conditional random field (CRF) produces the globally optimal label sequence. [Result/Conclusion] The model was compared with nine baseline methods on a Chinese entity corpus for the agricultural domain. Its Precision is 92.2%, Recall is 92.0%, and F1 is 92.11%, all higher than those of the baseline models, which indicates that the proposed model recognizes Chinese agricultural named entities more accurately.
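The abstract says a CRF layer picks the globally optimal label sequence. A minimal sketch of that decoding step (Viterbi search) is given below. It is illustrative only and is not the authors' implementation: the function name `viterbi_decode`, the BIO label set, and all scores are assumptions made up for the example. In the paper's pipeline the per-character scores would come from the BiLSTM and attention layers.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for one sentence.

    emissions[t][y]      : score of label y at position t (hypothetical values)
    transitions[(p, y)]  : score of moving from label p to label y;
                           pairs that are not listed count as 0.0
    """
    if not emissions:
        return []

    # best[y] = score of the best path that ends in label y at the current position
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that gives the best path into y.
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, y), 0.0))
            new_best[y] = best[prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final label to recover the whole path.
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    labels = ["B", "I", "O"]
    # Made-up scores for a two-character input.
    emissions = [
        {"B": 0.9, "I": 0.0, "O": 1.0},
        {"B": 0.1, "I": 1.0, "O": 0.2},
    ]
    # Penalize the invalid BIO transition O -> I.
    transitions = {("O", "I"): -10.0}
    print(viterbi_decode(emissions, transitions, labels))
```

Choosing the best label at each position independently would give O followed by I here, which is not a valid BIO sequence. Scoring whole paths with the transition penalty avoids that, which is the reason for putting a CRF on top of per-character scores.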
Keywords: natural language processing; named entity recognition; agricultural text; information extraction; BiLSTM; CRF
Classification No.: TP391.1 [Automation and Computer Technology: Computer Application Technology]