Authors: ZHAO Junyi, ZHOU Peng [1], YU Xinjie [2]
Affiliations: [1] College of Electrical and Information Engineering, Hubei School of Automotive Technology, Shiyan, Hubei 442002; [2] Ningbo Institute of Technology, Zhejiang University, Ningbo, Zhejiang 315100
Source: Journal of Shanxi Datong University (Natural Science Edition), 2025, No. 1, pp. 19-25 (7 pages)
Funding: Ningbo High-tech Zone 2023 Major Science and Technology Project [2023CX050007].
Abstract: Objective: To address the difficulty of understanding specialized terminology in archival named entity recognition and to improve the efficiency and accuracy of digital archives management. Methods: A named entity recognition model based on RoBERTa-BiLSTM-GCN-CRF is proposed for the archival domain. First, the pre-trained model RoBERTa produces vectors that carry rich semantic information, which addresses the archival terminology problem. Next, this semantic information is passed to a bidirectional long short-term memory network (BiLSTM) to strengthen the model's understanding of sequence information. Then, a graph convolutional network (GCN) captures the complex relationships between words in the text. Finally, a conditional random field (CRF) outputs the entity labels. Results: Low-classification archival texts provided by the Ningbo Archives in Zhejiang Province were collected, organized, and preprocessed into training, validation, and evaluation sets for entity recognition experiments. The RoBERTa-BiLSTM-GCN-CRF model achieved a precision of 96.20%, a recall of 95.83%, and an F1 score of 96.02%, an improvement over existing models. Conclusion: The RoBERTa-BiLSTM-GCN-CRF model performs well on archival entity recognition and effectively addresses the challenges of archival named entity recognition.
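The abstract states only that the CRF layer outputs the entity labels and does not say how decoding is done. Linear-chain CRFs are commonly decoded with the Viterbi algorithm, so the following is a minimal Python sketch under that assumption. The tag set, the emission scores (standing in for the upstream RoBERTa-BiLSTM-GCN output), and the transition scores are made-up illustrative values, not the paper's.

```python
from typing import Dict, List


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag]      : score of `tag` at position t (hypothetical values)
    transitions[prev][cur] : score of moving from `prev` to `cur`
    """
    tags = list(emissions[0])
    best = [dict(emissions[0])]  # best[t][tag]: best path score ending in tag
    back = []                    # back[t-1][tag]: predecessor on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


TAGS = ["B-ORG", "I-ORG", "O"]  # assumed BIO tag set, for illustration only

# Illustrative per-token scores for a three-token input.
emissions = [
    {"B-ORG": 1.0, "I-ORG": 0.2, "O": 1.2},
    {"B-ORG": 0.1, "I-ORG": 2.0, "O": 0.5},
    {"B-ORG": 0.1, "I-ORG": 0.3, "O": 2.0},
]

# All transitions score 0 except the invalid O -> I-ORG, which is penalized.
transitions = {p: {c: 0.0 for c in TAGS} for p in TAGS}
transitions["O"]["I-ORG"] = -10.0

decoded = viterbi_decode(emissions, transitions)
```

Picking the top-scoring tag per token independently would give `O, I-ORG, O`, which is not a valid BIO sequence. With the transition penalty, the decoder instead returns `B-ORG, I-ORG, O`, which illustrates why a CRF layer is often placed on top of per-token scores.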
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]