Practice of Data Governance of Chinese Journal Papers: Taking the Mining of CNKI Bibliographic Records and PDF Documents as an Example  (Cited by: 1)

Authors: ZHU Yu-qiang [1]; FAN Cui-li [2]

Affiliations: [1] Library of Shandong Normal University, Jinan 250014, China; [2] Shandong Science and Technology Press, Jinan 250002, China

Source: Journal of Xi'an University (Natural Science Edition), 2021, No. 4, pp. 114-122 (9 pages)

Abstract: To explore whether data governance work can achieve the same or similar quality of results under established specifications, and to make data governance methods and tools more intelligent and automated, this paper takes one university's data governance practice for papers published in Chinese journals as an example. A Python program automatically mines the bibliographic records and PDF documents downloaded from CNKI to identify the signed authors, calculate performance points, and compile statistics on how the journals carrying the articles are indexed by various evaluation systems. The results show that the program is highly automated and basically achieves the expected effect. Because data governance work still lacks the support of a mature, unified platform, there is considerable potential in writing customized data governance tools.

Keywords: data governance; text mining; Python

Classification: G251 [Cultural Sciences: Library Science]
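
The workflow summarized in the abstract (screening author affiliation, calculating performance points, and counting journal index coverage) might be sketched roughly as follows. This is only an illustrative sketch and not the authors' program. The plain-text record layout, the journal-to-index table, the point values, and every function name are assumptions made for this example. PDF text extraction is left out entirely.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Everything below is a hypothetical placeholder, not data from the paper.
HOME_INSTITUTION = "Shandong Normal University"
JOURNAL_INDEXES = {
    "Journal A": {"CSSCI", "PKU Core"},
    "Journal B": {"CSCD"},
}
POINTS_BY_INDEX = {"CSSCI": 10.0, "CSCD": 8.0, "PKU Core": 5.0}

# Assumed export layout: "Key: value" lines, records separated by blank lines.
SAMPLE = """\
Title: Paper one
Journal: Journal A
Authors: Zhang San; Li Si
Affiliations: Shandong Normal University Library; Another Institute

Title: Paper two
Journal: Journal B
Authors: Wang Wu
Affiliations: Another Institute; Shandong Normal University
"""


@dataclass
class Record:
    title: str = ""
    journal: str = ""
    authors: list = field(default_factory=list)
    affiliations: list = field(default_factory=list)


def split_list(value):
    """Split a semicolon-separated field into trimmed, non-empty items."""
    return [item.strip() for item in value.split(";") if item.strip()]


def parse_records(text):
    """Parse blank-line-separated blocks of 'Key: value' lines into Records."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block.strip():
            continue
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        records.append(Record(
            title=fields.get("Title", ""),
            journal=fields.get("Journal", ""),
            authors=split_list(fields.get("Authors", "")),
            affiliations=split_list(fields.get("Affiliations", "")),
        ))
    return records


def home_is_first_affiliation(record):
    """Assumed screening rule: the home institution must be listed first."""
    return bool(record.affiliations) and HOME_INSTITUTION in record.affiliations[0]


def performance_points(record):
    """Assumed scoring rule: the highest point value among the journal's indexes."""
    if not home_is_first_affiliation(record):
        return 0.0
    indexes = JOURNAL_INDEXES.get(record.journal, set())
    return max((POINTS_BY_INDEX.get(name, 0.0) for name in indexes), default=0.0)


def index_statistics(records):
    """Count how many records appear in journals covered by each index."""
    counts = Counter()
    for record in records:
        counts.update(JOURNAL_INDEXES.get(record.journal, set()))
    return counts


if __name__ == "__main__":
    for rec in parse_records(SAMPLE):
        print(rec.title, home_is_first_affiliation(rec), performance_points(rec))
```

Keeping the screening rule, the scoring rule, and the index table as separate pieces means each could be swapped for an institution's actual policy without touching the parsing step.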

 
