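Affiliations: [1] Department of Information Management and Research Center for Digital Humanities, Peking University, Beijing 100871 [2] Institute for the History of Natural Sciences, Chinese Academy of Sciences, Beijing 100190 [3] Department of Chinese Language and Literature, Peking University, Beijing 100871 [4] Literature Editorial Office, Zhonghua Book Company, Beijing 100073
Source: Journal of Library Science in China (《中国图书馆学报》), 2023, No. 1, pp. 82-98 (17 pages)
Funding: A research outcome of the National Natural Science Foundation of China Key International Cooperation Project "Research on the Construction of a Knowledge Graph of the Academic History of Chinese Confucianism" (No. 72010107003).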
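Abstract: Ancient book catalogs and their classification systems have significant academic value, and the development of digital scholarship offers new opportunities for digitally preserving and using these catalogs and for conducting bibliographic research supported by digital tools. Taking eight historical-bibliographic catalogs spanning more than two thousand years as data sources, this paper uses an iterative human-machine approach that combines machine preprocessing with expert proofreading to perform record splitting and field extraction, data completion, normalization, and book identification, ultimately producing a structured, normalized integration of more than 110,000 bibliographic records. Building on this dataset and starting from the research needs of domain experts, we combined statistics, visualization, and retrieval methods with human-computer interaction techniques to build a visual analysis system for ancient book catalogs across dynasties. The system has two main parts, bibliographic statistics and classification evolution analysis. The first provides fine-grained statistics and visual presentation of the bibliographic data, helping scholars clearly compare and trace the growth and decline of categories. The second visualizes how each work's classification changed across successive catalogs and where the works collected in each category came from, supporting pattern recognition in the splitting, merging, and transformation of categories. This study provides a data foundation and analysis tools for bibliographic research in the context of digital scholarship: it saves scholars considerable time collecting and organizing data, and it supports interpretive research such as analysis and comparison through new techniques and perspectives. 8 figures. 3 tables. 36 references.
Ancient book catalogs record and classify a large number of ancient Chinese books. They are of great academic value for studying both ancient literature and traditional knowledge organization. The development of digital scholarship has shed new light on the digital preservation and reuse of these ancient book catalogs, as well as on domain research supported by digital tools. Digital scholarship facilitates the digitization and datafication of ancient book catalogs. Moreover, new methods and computational tools enable the exploration of large collections, and new research questions can be raised from fresh perspectives. Recent studies have introduced computational methods to analyze the abstracts and classification systems of ancient book catalogs, but these studies were based on only one catalog or one particular category. It is imperative to integrate the catalogs from across history and to provide digital tools that let scholars explore and analyze them diachronically and holistically. In this study, we selected eight representative catalogs, mostly from official histories, as data sources: Hanshu Yiwenzhi, Suishu Jingjizhi, Jiutangshu Jingjizhi, Xintangshu Yiwenzhi, Songshi Yiwenzhi, Mingshi Yiwenzhi, Qingshigao Yiwenzhi and Siku Quanshu Zongmu. These catalogs cover the major dynasties in Chinese history, with a time span of more than two thousand years. We adopted a semi-automated data processing approach to integrate the book entries in the eight catalogs. The integration process iterated between machine pre-processing and expert manual correction and comprised three main steps: record splitting and field segmentation, field completion and normalization, and book identification. We ultimately obtained more than 110,000 structured data records and identified over 7,000 books that were recorded in at least two catalogs. Based on the integrated data, we designed and developed an interactive visual analysis system with features for statistics, visualization and record query. The system is designed to mainly meet two rese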