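Authors: ZHANG Liyuan (张力元); WANG Jun (王军) [2]
Affiliations: [1] Peking University Library, Beijing 100871; [2] Department of Information Management, Peking University, Beijing 100871
Source: Journal of Library Science in China (《中国图书馆学报》), 2022, No. 2, pp. 47-61 (15 pages)
Funding: One of the research outcomes of the National Natural Science Foundation of China Key International Cooperation Project "Research on the Construction of a Knowledge Graph of the Academic History of Chinese Confucianism" (Grant No. 72010107003).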
Abstract: Catalogs are an important tool for organizing and using ancient books, and a key research object of library and information science. Inter record (互著) and analytic record (别裁), two auxiliary methods in classical bibliography, analyze the content characteristics of a document in depth and then record it accurately and completely in the catalog system according to the diversity of its content, achieving the effect that "once classes are divided, academic ideas become clear" (类例既分,学术自明). This study maps inter record and analytic record to the text classification problem in text mining and proposes a machine-learning-based method framework for implementing them, providing a way to record ancient books under multiple categories in a catalog system. First, two machine learning models, TextCNN and BERT, are trained to classify the texts of ten classics from six pre-Qin schools; the results show that BERT outperforms TextCNN and reaches a classification accuracy of 91.64%. The fine-tuned BERT model is then used to classify Xunzi (《荀子》) and Guanzi (《管子》) at the granularity of chapters and sections, yielding inter record and analytic record results for the individual chapters of both books. The study demonstrates the value of digital technology for classical bibliography, classical philology, and the study of academic history from the perspective of digital humanities. 5 figures. 7 tables. 43 references.

English abstract: Bibliography is an important tool for organizing and utilizing ancient books and is also a key research object of library and information science. As two auxiliary methods in classical bibliography, inter record and analytic record aim to record documents accurately and completely in the bibliographic system according to the diversity of their contents, based on an in-depth analysis of their content characteristics, so as to achieve the function of "once classes are divided, academic ideas become clear". However, the traditional methods of inter record and analytic record are mainly carried out by hand, which leads to issues such as low efficiency, high cost, weak scientific rigor, weak objectivity, and poor reliability. This study aims to introduce machine learning to classical bibliography from the perspective of digital humanities, providing a new implementation strategy for inter record and analytic record. The study first proposes mapping inter record and analytic record in classical bibliography to the problem of text classification, and puts forward a method framework based on machine learning to support the multi-category recording of ancient books in the bibliographic system. The study takes the pre-Qin schools and their representative books as the object for verifying the method. Two machine learning models, TextCNN and BERT, are used to classify ten ancient books from six pre-Qin schools, with one or two books for each school. The classification results show that the fine-tuned BERT model outperforms the TextCNN model and achieves a classification accuracy of 91.64%. The fine-tuned BERT model is then used to classify two controversial books, Xunzi and Guanzi, and the inter record and analytic record results for the two books are obtained. It is suggested that Xunzi should be classified under both the Legalism and Confucianism categories, while Guanzi should be classified only under the Legalism category. In particular, this study also finds that the first 26 c