Generating a Hematopoietic Stem Cell Knowledge Graph for Scientific Knowledge Discovery  Cited by: 1


Authors: HU Zhengyin [1,2]; LIU Leilei; CHEN Wenjie; LIU Chunjiang; QIAN Li [2,3]; SONG Yibing

Affiliations: [1] Chengdu Library and Information Centre, Chinese Academy of Sciences, Chengdu, Sichuan 610041, China; [2] Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; [3] National Science Library, Chinese Academy of Sciences, Beijing 100190, China; [4] Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, Guangdong 510530, China

Source: Frontiers of Data & Computing (《数据与计算发展前沿》), 2021, Issue 6, pp. 81-97 (17 pages)

Funding: National Key Research and Development Program "Application demonstration of comprehensive science and technology services for typical industries in Pearl River Delta Urban Agglomeration" (Grant No. 2018YFB1404205); Ministry of Science and Technology Innovation Methods Special Project (Grant No. 2019IM020100).

Abstract: [Objective] Hematopoietic stem cells (HSCs) are among the most effective stem cells for clinical treatment. Discovering important knowledge entities, knowledge relations, and knowledge paths through literature mining is of great significance for HSC knowledge discovery. The knowledge graph (KG) is a new kind of knowledge organization technique that supports multi-level, fine-grained, semantically rich organization and interlinking of knowledge units such as entities, relations, and paths, and it is widely used in scientific knowledge discovery (SKD). [Methods] This paper proposes a framework for generating a domain KG from Subject-Predicate-Object (SPO) triples extracted from literature. The framework comprises six processes: literature retrieval, SPO extraction, SPO cleanup, SPO ranking, discovery pattern integration, and graph building. An HSC KG was then constructed on the Neo4j graph database following the framework. Finally, three SKD scenarios based on the HSC KG are presented through empirical analysis: open discovery, closed discovery, and research topic mining. [Results] The results show that the HSC KG built with this framework has the advantages of "using a graph data structure", "integrating discovery patterns", "fusing native graph mining algorithms", and being "easy to use", and that it can effectively support open discovery, closed discovery, and topic discovery in the HSC field.
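
The cleanup, ranking, graph building, and open discovery steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the triples are invented placeholders rather than data from the paper, frequency counting stands in for whatever ranking method the authors use, a plain Python dictionary replaces the Neo4j database, and "open discovery" is read in the usual literature-based-discovery sense of finding A -> B -> C paths where A and C are not directly linked.

```python
from collections import Counter, defaultdict

# Illustrative triples only; these are made up and are not results from the paper.
RAW_TRIPLES = [
    ("drug_x", "STIMULATES", "gene_y"),
    ("drug_x", "STIMULATES", "gene_y"),        # repeated mention raises the count
    ("gene_y", "AFFECTS", "hsc_expansion"),
    ("drug_x", "INTERACTS_WITH", "protein_z"),
    ("protein_z", "INHIBITS", "hsc_expansion"),
    ("drug_x", "TREATS", "anemia"),
    ("patients", "PROCESS_OF", "patients"),    # self-loop, dropped in cleanup
    ("gene_y", "AFFECTS", ""),                 # incomplete, dropped in cleanup
]


def clean(triples):
    """Strip whitespace, then drop incomplete triples and self-loops."""
    cleaned = []
    for s, p, o in triples:
        s, p, o = s.strip(), p.strip(), o.strip()
        if s and p and o and s != o:
            cleaned.append((s, p, o))
    return cleaned


def rank(triples):
    """Rank distinct triples by occurrence count (a simple stand-in metric)."""
    return Counter(triples).most_common()


def build_graph(ranked):
    """Build an adjacency map: subject -> {object: [(predicate, count), ...]}."""
    graph = defaultdict(lambda: defaultdict(list))
    for (s, p, o), count in ranked:
        graph[s][o].append((p, count))
    return graph


def open_discovery(graph, a):
    """Return {C: [B, ...]} for paths A -> B -> C where A has no direct link to C."""
    direct = set(graph.get(a, {}))
    found = defaultdict(list)
    for b in sorted(direct):
        for c in sorted(graph.get(b, {})):
            if c != a and c not in direct:
                found[c].append(b)
    return dict(found)


ranked = rank(clean(RAW_TRIPLES))
graph = build_graph(ranked)
candidates = open_discovery(graph, "drug_x")
print(candidates)
```

Here `candidates` maps each indirectly reachable concept to the intermediate concepts that connect it to the start node. In a graph database such as Neo4j the same pattern would typically be expressed as a path query instead of nested loops.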

Keywords: knowledge graph; SPO triples; scientific knowledge discovery; literature mining; hematopoietic stem cell

Classification codes: R457.7 [Medicine and Health: Therapeutics]; TP391.1 [Medicine and Health: Clinical Medicine]

 
