Construction and Application Design of Innovation Knowledge Graph in Scientific Papers: A Case Study of Information Science Journal Papers


Authors: CAO Shujin (曹树金); YAN Song (闫颂) [3]; LI Ruijing (李睿婧); CAO Ruye (曹茹烨)

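Affiliations: [1] Shandong University of Technology, Zibo, Shandong 255000 [2] School of Information Management, Shandong University of Technology [3] School of Information Management, Sun Yat-sen University, Guangzhou, Guangdong 510006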

Source: Journal of Library Science in China (《中国图书馆学报》), 2025, No. 1, pp. 82-100 (19 pages)

Abstract: Identifying and exploring the relationships among innovation knowledge in scientific papers can reveal the complete innovation context of each achievement. This helps researchers understand global, overarching innovation network relationships, reflects existing innovation patterns and methods, and reveals potential innovation paths based on latent relationships among innovation elements. Such insights can serve as a reference for further innovation research and practice, ultimately enhancing the efficiency of scientific and technological innovation. In this study, we extract innovation knowledge entities and relationships from the full text of scientific papers using deep learning models and semantic information. We then construct an innovation knowledge graph to address the scientific question of how to leverage the knowledge graph to mine accurate knowledge and information that directly supports innovation in scientific papers. Using the full text of papers from prominent Chinese journals in information science as our research corpus, we first extract nine types of innovation knowledge entities, such as "contribution", "process", "tool", "method", and "theory", using BERT-BiLSTM-CRF and RoBERTa-BiLSTM-CRF model frameworks together with semantic role information. Next, seven types of relationships between entities, including "innovation basis", "innovation contribution", and "innovation process", are extracted using regular expressions and other methods. Finally, the resulting knowledge graph is stored in a Neo4j graph database, and an application platform is designed on this basis. Experimental results show that, compared with BERT, the optimizations made in RoBERTa significantly improve performance on the task in this study, making the RoBERTa-BiLSTM-CRF model more suitable for the named entity recognition task. Furthermore, the constructed innovation knowledge graph is both feasible and effective: it reveals the hidden networks of innovation knowledge relationships embedded in the full text of papers, provides precise knowledge and intelligence support for scientific and technological innovation, and mines the value of scientific papers in depth from the innovation dimension.
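The abstract mentions extracting relationships with regular expressions and storing the graph in Neo4j, but it does not give the rules themselves. The following is only a minimal sketch of how such a step might look. The cue phrases, the relation labels `INNOVATION_BASIS` and `INNOVATION_CONTRIBUTION`, the `Entity` node label, and the function names are all hypothetical illustrations and are not taken from the paper.

```python
import re

# Hypothetical cue patterns (assumption): the paper's actual rules are not
# given in the abstract. Each pattern captures a head and a tail entity.
RELATION_PATTERNS = {
    "INNOVATION_BASIS": re.compile(
        r"(?P<head>.+?) is based on (?P<tail>.+?)\.?$"
    ),
    "INNOVATION_CONTRIBUTION": re.compile(
        r"(?P<head>.+?) contributes (?P<tail>.+?)\.?$"
    ),
}


def extract_relations(sentence):
    """Return (head, relation, tail) triples for every pattern that matches."""
    triples = []
    for relation, pattern in RELATION_PATTERNS.items():
        match = pattern.search(sentence)
        if match:
            head = match.group("head").strip()
            tail = match.group("tail").strip()
            triples.append((head, relation, tail))
    return triples


def to_cypher(triple):
    """Build a parameterized Cypher MERGE statement for one triple.

    Relationship types cannot be passed as Cypher parameters, so the type is
    interpolated into the query text; it is checked against the fixed
    pattern table above before being used.
    """
    head, relation, tail = triple
    if relation not in RELATION_PATTERNS:
        raise ValueError(f"unknown relation type: {relation}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{relation}]->(t)"
    )
    return query, {"head": head, "tail": tail}


if __name__ == "__main__":
    sentence = "The proposed model is based on semantic role labeling."
    for triple in extract_relations(sentence):
        query, params = to_cypher(triple)
        print(query, params)
```

The sketch only builds the query text and its parameters, so it runs without a database. In a real setup these would be handed to a Neo4j client, and the entity spans would come from the named entity recognition model rather than from the regular expression itself.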

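Keywords: scientific papers; innovation knowledge graph; deep learning; scientific and technological innovation intelligence; innovation knowledge mining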

Classification code: G254.29 [Cultural Science: Library Science]

 
