Research on Inter Record and Analytic Record of Classical Bibliography Based on Machine Learning (基于机器学习的古籍目录互著与别裁探析)

Cited by: 11

Authors: ZHANG Liyuan (张力元) [1]; WANG Jun (王军) [2]

Affiliations: [1] Peking University Library, Beijing 100871; [2] Department of Information Management, Peking University, Beijing 100871

Source: Journal of Library Science in China (《中国图书馆学报》), 2022, No. 2, pp. 47-61 (15 pages)

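Funding: One of the research outcomes of the National Natural Science Foundation of China key international cooperation project "Research on Constructing a Knowledge Graph of the Academic History of Chinese Confucianism" (中国儒家学术史知识图谱构建研究; No. 72010107003).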

Abstract: Bibliography is an important tool for organizing and using ancient books, and a key research object of library and information science. As two auxiliary methods in classical bibliography, inter record (互著) and analytic record (别裁) analyze the content characteristics of a document in depth and, according to the diversity of its content, record it accurately and completely in the bibliographic system, so as to achieve the effect of "once classes are divided, academic ideas become clear" (类例既分, 学术自明). However, the traditional methods of inter record and analytic record are carried out mainly by hand, which leads to low efficiency, high cost, and weak scientific rigor, objectivity, and reliability. This study introduces machine learning into classical bibliography from the perspective of digital humanities to provide a new implementation strategy for inter record and analytic record. It maps inter record and analytic record to the text classification problem in text mining and proposes a machine-learning-based method framework for recording ancient books under multiple categories of the bibliographic system. The method is verified on the pre-Qin schools of thought and their representative books. Two machine learning models, TextCNN and BERT, are trained to classify the texts of ten books from six pre-Qin schools, with one or two books per school. The results show that the fine-tuned BERT model outperforms TextCNN and reaches a classification accuracy of 91.64%. The fine-tuned BERT model is then used to classify two controversial books, Xunzi (《荀子》) and Guanzi (《管子》), at the granularity of individual chapters and sections (篇, 章), yielding inter record and analytic record results for each part of the two books. It is suggested that Xunzi should be classified under both the Legalism and Confucianism categories, while Guanzi should be classified only under the Legalism category. In particular, this study also finds that the first 26 c

The study demonstrates the value of digital technology for classical bibliography, classical philology, and the study of academic history in the context of digital humanities. 5 figures. 7 tables. 43 references.
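The abstract describes classifying each chapter of a book and then deriving inter record and analytic record results from those chapter-level labels, but it does not state the decision rule. The following is only a minimal sketch of one plausible rule, under stated assumptions: the function name `suggest_records`, the `min_share` threshold, and the toy labels are all hypothetical and are not taken from the paper. A book is listed under every school that accounts for at least `min_share` of its chapters (inter record), and chapters predicted to belong to any other school are flagged for separate listing (analytic record).

```python
from collections import Counter


def suggest_records(chapter_labels, min_share=0.3):
    """Turn chapter-level predictions into book-level suggestions.

    chapter_labels: dict mapping chapter name -> predicted school label.
    min_share: hypothetical threshold; a school must cover at least this
        fraction of chapters for the whole book to be listed under it.
    Returns (book_categories, analytic_chapters).
    """
    counts = Counter(chapter_labels.values())
    total = sum(counts.values())

    # Inter record: list the whole book under every sufficiently common school.
    book_categories = sorted(
        school for school, n in counts.items() if n / total >= min_share
    )

    # Analytic record: chapters whose predicted school is not a book-level
    # category are candidates for being recorded separately under that school.
    analytic_chapters = {
        chapter: school
        for chapter, school in chapter_labels.items()
        if school not in book_categories
    }
    return book_categories, analytic_chapters


# Invented toy predictions for illustration only (not results from the paper).
toy_predictions = {
    "chapter_1": "Confucianism",
    "chapter_2": "Confucianism",
    "chapter_3": "Legalism",
    "chapter_4": "Legalism",
    "chapter_5": "Taoism",
}
categories, analytic = suggest_records(toy_predictions)
print(categories)  # book-level categories (inter record)
print(analytic)    # chapters suggested for analytic record
```

In practice the labels would come from a fine-tuned classifier such as the BERT model mentioned above; a fixed share threshold is just one simple way to aggregate them, and the right value would need to be judged against expert cataloguing.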

Keywords: classical bibliography (古籍目录); inter record (互著); analytic record (别裁); machine learning; digital humanities

Classification number: G257 [Cultural Sciences: Library Science]
