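Authors: Meng Jiana (孟佳娜); Xu Yingao (许英傲); Zhao Dandan (赵丹丹); Li Fengyi (李丰毅); Zhao Di (赵迪)
Affiliation: [1] School of Computer Science and Engineering, Dalian Minzu University, Dalian 116600
Source: Knowledge Management Forum (《知识管理论坛》), 2024, No. 6, pp. 533-546 (14 pages)
Funding: One of the research outputs of the Ministry of Education Humanities and Social Sciences Research Planning Fund project "Research on Internet-Based Intelligent Dissemination of Chinese Culture Based on Knowledge Graphs" (Grant No. 23YJA860010) and the Fundamental Research Funds for the Central Universities project "Research on Sentiment Analysis Driven by Large Models and Knowledge" (Grant No. 140250)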
Abstract: [Purpose/Significance] Using named entity recognition (NER) to mine ancient texts in depth advances the digitization of ancient Chinese books, which matters for the study of history, for strengthening cultural confidence, and for promoting traditional Chinese culture. [Method/Process] A multi-granularity feature fusion method for NER in classical Chinese is proposed. Using Zuo Zhuan (《左传》) as the research corpus, NER tasks are constructed for person names, place names, time expressions, and other entity types. First, classical character information, part-of-speech (POS) information, and glyph features are fused to strengthen the input representation. Next, an auxiliary task that predicts entity heads and tails is added to learn boundary information in classical sentences, while a Transfer interactor heuristically learns the word-formation patterns of classical Chinese entities; BiLSTM and IDCNN (Iterated Dilated Convolutional Neural Network) jointly extract contextual information. Finally, the learned features are fused by weighting and fed into a CRF (Conditional Random Field) for entity prediction. [Result/Conclusion] Experiments show that, compared with the mainstream BERT-BiLSTM-CRF model, the proposed method improves precision, recall, and F1 by 5.09%, 13.45%, and 9.87%, respectively. The multi-granularity feature fusion method can accurately recognize named entities in ancient texts.
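The abstract reports precision, recall, and F1 but does not say how they were computed. As a rough illustration only, the sketch below shows one common way to score NER output: exact-match, entity-level scoring over BIO tags. The BIO scheme, the `PER`/`LOC` labels, and the function names `extract_spans` and `prf1` are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); hypothetical representation
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                # "O", or an "I-" tag with no matching open entity, is ignored
                start, label = None, None
    return spans


def prf1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 with exact span and type match."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= {(idx, *s) for s in extract_spans(g)}
        pred_spans |= {(idx, *s) for s in extract_spans(p)}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the gold sequence has two entities, the prediction finds one.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
precision, recall, f1 = prf1(gold, pred)
```

In this toy case the single predicted entity is correct (precision 1.0) but one of the two gold entities is missed (recall 0.5), which shows how recall can move independently of precision, as in the figures the abstract reports.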
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]