Analysis of Research Hotspots in Natural Products Based on Word Clouds and Hierarchical Clustering  Cited by: 8

Research hotspots of natural products based on co-word analysis


Authors: NI Bing-wei (倪冰苇); ZHAO Hong-ping (赵鸿萍)[1]; GU Yue-qing (顾月清)[1] (China Pharmaceutical University, Nanjing 211198, China)

Affiliation: [1] China Pharmaceutical University, Nanjing 211198, China

Source: Chinese Journal of New Drugs (《中国新药杂志》), 2020, No. 12, pp. 1326-1333 (8 pages)

Abstract: Identifying research hotspots in the field of natural products can point researchers in new drug development, traditional Chinese medicine, botany, and related fields toward promising directions. This study first used a web crawler to collect, from the PubMed database, the records of 2,490 articles published since January 1, 2019 in 10 authoritative natural product journals; after data cleaning, 2,278 articles and 8,539 keywords remained. Word frequencies were then counted with a Python program, and synonyms among the top 100 high-frequency words were merged, yielding 77 high-frequency words that were displayed as a word cloud. To uncover hot research directions, words that could not reflect such directions were removed from the high-frequency set, leaving 31 hotspot keywords. A co-word matrix and a dissimilarity matrix were then built for these 31 keywords and analyzed by hierarchical clustering. The methods used in this study can serve as a reference for identifying research hotspots in other fields.
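The pipeline described in the abstract (frequency counting, synonym merging, co-word matrix, dissimilarity matrix, hierarchical clustering) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the toy keyword lists and the `SYNONYMS` table are hypothetical, and the abstract does not state which similarity measure or linkage rule was used, so the Ochiai coefficient and average linkage are assumptions made here. Only the Python standard library is used; the word cloud step is omitted.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each inner list holds the keywords of one article.
articles = [
    ["alkaloids", "cytotoxicity", "marine natural products"],
    ["alkaloid", "cytotoxicity"],
    ["flavonoids", "antioxidant"],
    ["flavonoids", "antioxidant activity"],
    ["alkaloids", "marine natural products"],
]

# Hypothetical synonym table used to merge variants into one canonical term.
SYNONYMS = {"alkaloid": "alkaloids", "antioxidant activity": "antioxidant"}


def normalize(keywords):
    """Lower-case, merge synonyms, and drop duplicates within one article."""
    return sorted({SYNONYMS.get(k.lower(), k.lower()) for k in keywords})


def cooccurrence(docs):
    """Count keyword frequencies and pairwise co-occurrences (co-word counts)."""
    freq, pairs = Counter(), Counter()
    for doc in docs:
        freq.update(doc)
        # Each doc is sorted, so every pair is stored in alphabetical order.
        pairs.update(combinations(doc, 2))
    return freq, pairs


def dissimilarity(a, b, freq, pairs):
    """Return 1 - Ochiai coefficient (the similarity measure is an assumption)."""
    if a == b:
        return 0.0
    co = pairs[tuple(sorted((a, b)))]
    return 1.0 - co / sqrt(freq[a] * freq[b])


def average_linkage(terms, freq, pairs, n_clusters):
    """Naive agglomerative clustering with average linkage (assumed rule)."""
    clusters = [[t] for t in terms]

    def dist(c1, c2):
        total = sum(dissimilarity(a, b, freq, pairs) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge the second into the first.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


docs = [normalize(a) for a in articles]
freq, pairs = cooccurrence(docs)
terms = sorted(freq)
clusters = average_linkage(terms, freq, pairs, n_clusters=2)
print(freq.most_common(3))
print(clusters)
```

On this toy input the alkaloid-related and flavonoid-related keywords fall into separate clusters. For a real keyword set, a library routine such as `scipy.cluster.hierarchy.linkage` would normally replace the naive loop, and the number of clusters would be chosen from the dendrogram rather than fixed in advance.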

Keywords: natural products; word cloud; hierarchical clustering; co-word analysis; bibliometrics

Classification code: R95 [Medicine and Health: Pharmacy]

 
