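Knowledge Annotation and Quantitative Analysis of Citations of the Three Commentaries on Records of the Grand Historian in the Perspective of Computational Humanities

Original title: 计算人文视域下的《史记》三家注引书知识标注与计量分析初探 (A Preliminary Study of Knowledge Annotation and Quantitative Analysis of Cited Works in the Three Commentaries on the Records of the Grand Historian from the Perspective of Computational Humanities)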


Authors: QI Yue (齐月); LIU Chufei (刘雏菲); LI Wenqi (李文祺); MENG Kai (孟凯); WANG Dongbo (王东波) [1]; LIU Liu (刘浏)

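Affiliations: [1] College of Information Management, Nanjing Agricultural University; Jiangsu Provincial Key Research Base of Philosophy and Social Sciences in Universities for Humanities and Social Computing; Research Center for Domain Knowledge Association, Nanjing Agricultural University, Nanjing, Jiangsu 210095. [2] School of Marxism, Nanjing Agricultural University, Nanjing, Jiangsu 210095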

Source: Journal of Academic Libraries (《大学图书馆学报》), 2024, No. 5, pp. 64-77 (14 pages)

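Funding: This work is one of the research outcomes of the National Natural Science Foundation of China Youth Program project "Research on the Construction and Application of a Knowledge Graph of Citations in Classical Texts Based on Deep Learning" (Grant No. 72004095) and the National Social Science Fund of China Major Program project "Construction and Application of a Cross-Language Knowledge Base of Ancient Chinese Classics" (Grant No. 21&ZD331).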

Abstract: Computational humanities research that builds on knowledge mining and knowledge base construction for ancient texts, and on data analysis and visualization, has gradually become an important direction in the preservation, study, and use of ancient books. Studying citations in ancient books from this perspective brings new ideas and techniques to traditional research questions, broadens the scope of ancient book research, and provides reliable data support. Using manual annotation combined with deep learning, this paper annotates the citation knowledge in the Three Commentaries on the Records of the Grand Historian, then tallies and presents the distribution of cited books and cited authors, first by category and then by comparing the three commentaries. Taking the Three Commentaries as an example, the study establishes a complete technical workflow and framework for annotating citation knowledge in ancient books and introduces statistical measurement and visual analysis into the study of such citations. It offers a reference for advancing both research on the Records of the Grand Historian and research on citations in ancient books.

This study aims to explore a new method for studying the Three Commentaries on the Records of the Grand Historian from the perspective of computational humanities. Focusing on the phenomenon of citation in the Three Commentaries, the study combined manual and deep learning methods from the perspective of digital humanities to organize the content and knowledge in the commentaries and constructed a knowledge base of ancient citations. A named entity recognition model based on deep learning was built and used to proofread the manual annotation results automatically. After another round of manual proofreading and entity disambiguation, a knowledge base of citations in the Three Commentaries was constructed and used for a comprehensive statistical analysis. The analysis started from two types of citations, namely cited books and cited authors. Taking the five parts of the Records of the Grand Historian (Annals, Chronological Tables, Treatises, Hereditary Houses, and Biographies) as the classification, it examined the frequency-type patterns of citations, especially the scattered versus concentrated and dense versus sparse distributions. It also used the statistical data to construct frequency-type box plots to examine anomalies and skewness in the citation data. The study further examined and compared the linear distribution patterns of citations. Examples were given to analyze high-frequency cited books and to explain the causes of the statistical patterns. In addition, through a comparative study of the three commentaries, the citation phenomena and distribution patterns above were examined and analyzed in more detail. The research forms a complete set of technical processes and frameworks for the annotation of ancient book citations, covering annotation specifications, knowledge annotation, knowledge base construction, and knowledge measurement and analysis. The knowledge base constructed has important resource value for the

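Keywords: computational humanities; digital humanities; citations in ancient books; Three Commentaries on the Records of the Grand Historian; text knowledge mining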

Classification number: G250 [Cultural Science: Library Science]

 
