A Probe into the Construction of Red Literature Metadata Representation System from the Perspective of Transfer Learning


Authors: Wu Shuai; He Lin [1]; Yang Hailing; Lu Yingjie (College of Information Management, Nanjing Agricultural University, Jiangsu 210095; Yuhuatai Red Culture Research Institute, Nanjing 210012)

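Affiliations: [1] College of Information Management, Nanjing Agricultural University, Jiangsu 210095; [2] Yuhuatai Red Culture Research Institute, Nanjing 210012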

Source: Information and Documentation Services (《情报资料工作》), 2024, No. 6, pp. 84-92 (9 pages)

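Funding: This work is one of the research outputs of the National Social Science Fund of China Key Project "Research on Knowledge Organization and Intelligent Content Generation for Yuhua Martyrs Literature" (Grant No. 23TQA00341); the Nanjing Agricultural University Fundamental Research Funds for the Central Universities project "Research on Mining Yuhua Martyrs' Revolutionary Literature from the Perspective of New Technologies" (Project No. SKCX2023007); and the Jiangsu Province Postgraduate Research and Practice Innovation Program project "Research on Character Relation Extraction and Reasoning for Revolutionary Figure Biographies" (Project No. KYCX24_1013).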

Abstract: [Purpose/significance] Red literature is vast in quantity and diverse in type, which makes fine-grained annotation difficult. How to use a small amount of existing annotated data to annotate red literature automatically is one of the pressing problems in advancing its intelligent processing and application. This paper designs a metadata representation system for red literature and uses transfer learning to run annotation experiments on unannotated documents, with the aim of automatically annotating red literature metadata knowledge. [Method/process] First, a BERT-BiLSTM-MHA-CRF model is constructed that combines a pre-trained model, a bidirectional long short-term memory network, a multi-head attention mechanism, and a conditional random field. Second, several candidate metadata representation systems are designed according to the content characteristics of red literature. Finally, the fit between metadata representation systems and the transfer learning model is examined across automated annotation tasks for different types of red literature. [Result/conclusion] The MRS 6 metadata representation system can serve as a general-purpose metadata representation system for red literature. The combination "BERT-BiLSTM-MHA-CRF + MRS 6" is suitable for automated annotation of red literature in different scenarios and shows good generalization ability in automated annotation experiments on red literature of the same type.
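
The abstract names a conditional random field as the last layer of BERT-BiLSTM-MHA-CRF. As a rough illustration of what such a layer does at prediction time, the following minimal sketch runs Viterbi decoding over per-token tag scores. Everything specific here is an assumption made for illustration: the tag set, the scores, and the function name are hypothetical, since the abstract does not list the MRS 6 labels or give implementation details. In the described model the emission scores would come from the BERT, BiLSTM, and multi-head attention layers rather than being written by hand.

```python
from typing import List

# Hypothetical BIO tag set; the abstract does not list the MRS 6 labels.
TAGS = ["O", "B-PER", "I-PER"]


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   -- score of tag j at token position t
    transitions[i][j] -- score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    score = list(emissions[0])          # best score ending in each tag so far
    backpointers: List[List[int]] = []  # best previous tag for each step

    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            # Pick the previous tag that maximises the path score into tag j.
            best_prev = max(range(n_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores for a three-token input (illustrative only).
EMISSIONS = [
    [0.1, 2.0, 0.3],
    [0.2, 0.1, 1.5],
    [1.0, 0.2, 1.2],
]
# The O -> I-PER transition is heavily penalised, as BIO tagging requires.
TRANSITIONS = [
    [0.5, 0.2, -10.0],  # from O
    [0.1, -1.0, 1.0],   # from B-PER
    [0.3, -1.0, 0.5],   # from I-PER
]

decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)
```

The transition scores are what separate a CRF from independent per-token classification: a sequence such as O followed by I-PER is ruled out by its transition score even when the token-level score for I-PER is high.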

Keywords: transfer learning; red literature; metadata; automatic annotation; BERT-BiLSTM-MHA-CRF

Classification code: G254 [Cultural Science - Library Science]

 
