Methods of Patent Knowledge Graph Construction (一种专利知识图谱的构建方法)  Cited by: 5


Authors: DENG Liang (邓亮); CAO Cun-gen (曹存根) [4] (School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China; Patent Office, China National Intellectual Property Administration, Beijing 100083, China; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)

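Affiliations: [1] School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049 [2] Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168 [3] Patent Office, China National Intellectual Property Administration, Beijing 100083 [4] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190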

Source: Computer Science (《计算机科学》), 2022, No. 11, pp. 185-196 (12 pages)

Abstract: Patent knowledge graphs play an important role in applications such as precise patent retrieval, in-depth patent analysis, and patent knowledge training. This paper proposes a practical method for constructing a patent knowledge graph based on a seed knowledge graph, text mining, and relation completion. To ensure quality, a seed patent knowledge graph is first built manually; concept and relation extraction based on patent text patterns is then used to expand the seed graph; finally, the expanded patent knowledge graph is evaluated quantitatively. For patents in the field of traditional Chinese medicine, seed knowledge is extracted manually and lexical and syntactic patterns are summarized by hand. Machine learning is then used to learn new lexical and syntactic patterns, after which the seed patent knowledge graph is expanded and completed. Experimental results show that the seed knowledge graph for traditional Chinese medicine patents contains 19453 nodes and 194775 relations; after expansion these reach 558461 nodes and 7275958 relations, increases of 27.7 and 36.3 times, respectively.
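The expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular-expression patterns, the relation names (`has_ingredient`, `treats`), the `expand` function, and the rule that a new triple must touch a concept already in the seed graph are all assumptions made for illustration, and the sample sentences are invented English stand-ins for patent text.

```python
import re
from typing import Iterable, List, Pattern, Set, Tuple

Triple = Tuple[str, str, str]  # (head concept, relation, tail concept)

# Hypothetical lexico-syntactic patterns paired with the relation they signal.
# In the paper such patterns are hand-written first and then learned;
# these two are illustrative only.
PATTERNS: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"^(?P<head>.+?) contains (?P<tail>.+?)\.?$"), "has_ingredient"),
    (re.compile(r"^(?P<head>.+?) is used for treating (?P<tail>.+?)\.?$"), "treats"),
]


def expand(seed: Set[Triple], sentences: Iterable[str]) -> Set[Triple]:
    """Return the seed graph plus triples matched by PATTERNS.

    Assumed quality guard: a candidate triple is kept only if its head
    or tail is a concept that already appears in the seed graph.
    """
    graph = set(seed)
    known = {h for h, _, _ in seed} | {t for _, _, t in seed}
    for sentence in sentences:
        text = sentence.strip()
        for pattern, relation in PATTERNS:
            match = pattern.match(text)
            if match is None:
                continue
            head, tail = match.group("head"), match.group("tail")
            if head in known or tail in known:
                graph.add((head, relation, tail))
    return graph


if __name__ == "__main__":
    seed = {("ginseng decoction", "is_a", "herbal formula")}
    sentences = [
        "ginseng decoction contains astragalus root.",
        "ginseng decoction is used for treating fatigue.",
        "unrelated widget contains gear.",  # dropped: no seed concept involved
    ]
    for triple in sorted(expand(seed, sentences)):
        print(triple)
```

This sketch makes a single pass and only checks new triples against the original seed concepts. A fuller pipeline would also need pattern learning, relation completion, and quantitative evaluation, which the abstract mentions but does not specify.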

Keywords: patent text; patent knowledge graph; lexical and syntactic analysis; representation learning

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
