Chinese Named Entity Recognition Method for Musk Deer Domain Based on Cross-Attention Enhanced Lexicon Features  


Authors: Yumei Hao, Haiyan Wang, Dong Zhang

Affiliations: [1] School of Information Science and Technology, Beijing Forestry University, Beijing, 100083, China [2] Engineering Research Center for Forestry-Oriented Intelligent Information Processing, National Forestry and Grassland Administration, Beijing, 100083, China [3] School of Ecology and Nature Conservation, Beijing Forestry University, Beijing, 100083, China

Source: Computers, Materials & Continua, 2025, No. 5, pp. 2989-3005 (17 pages)

Funding: Funded by the 5·5 Engineering Research & Innovation Team Project of Beijing Forestry University (No. BLRC2023C02).

Abstract: Named entity recognition (NER) in the musk deer domain extracts specific types of entities from unstructured text and is a fundamental component of knowledge graphs, Q&A systems, and text summarization systems for the domain. To address the limited annotated data, diverse entity types, and ambiguous Chinese word boundaries in musk deer domain NER, we present a novel NER model, CAELF-GP, based on cross-attention enhanced lexical features (CAELF). Specifically, we employ BERT as the character encoder and integrate external lexical information at the character representation layer. In the feature fusion module, instead of merging external dictionary information indiscriminately, we adopt a fusion method based on a cross-attention mechanism, which guides the model to focus on important lexical information by computing the correlation between each character and its corresponding word sets. This module enhances the model's semantic representation and its ability to recognize entity boundaries. Finally, we use the GlobalPointer (GP) decoding module for entity type recognition, which can identify both nested and non-nested entities. Because no publicly available dataset currently exists for the musk deer domain, we built an NER dataset for it by collecting relevant literature and working under the guidance of domain experts. The dataset supports training and validation of the model and provides a data foundation for subsequent related research. We evaluate the model on two public datasets and on the musk deer dataset. The results show that it outperforms the baseline models, offering a promising technical approach for the intelligent recognition of named entities in the musk deer domain.
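
The character-to-word-set fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the model's equations, so everything here is an assumption made for illustration: the scaled dot-product score, the absence of learned projection matrices, the concatenation of the character vector with the attended word summary, and the function name `fuse_char_with_words`. It is not the CAELF-GP implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_char_with_words(char_vec, word_vecs):
    """Attend from one character vector over its matched lexicon word vectors.

    char_vec:  list of floats, the character representation (e.g. from BERT).
    word_vecs: list of word embeddings matched to this character; may be empty.
    Returns (fused_vec, weights), where fused_vec is char_vec concatenated with
    the attention-weighted summary of the word vectors (an assumed fusion rule).
    """
    dim = len(char_vec)
    if not word_vecs:
        # No lexicon match: pad with zeros so every character has the same size.
        return list(char_vec) + [0.0] * dim, []
    scale = math.sqrt(dim)
    # Correlation between the character and each candidate word.
    weights = softmax([dot(char_vec, w) / scale for w in word_vecs])
    # Weighted sum of word vectors: more relevant words contribute more.
    summary = [
        sum(wt * w[i] for wt, w in zip(weights, word_vecs)) for i in range(dim)
    ]
    return list(char_vec) + summary, weights


if __name__ == "__main__":
    fused, weights = fuse_char_with_words([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]])
    print(weights)  # the first word aligns with the character and gets more weight
    print(fused)
```

In this toy setting, a word whose embedding aligns with the character receives a larger weight than an unrelated one, which is the sense in which attention avoids merging dictionary information indiscriminately.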

Keywords: named entity recognition; musk deer; cross-attention; lexicon enhancement

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
