Link Prediction Based on Medical Subject Heading (MeSH) Co-occurrence Networks  Cited by: 10

Link Prediction in MeSH Terms Co-occurring Networks


Authors: Gong Xue[1], Cui Lei[1] (Library of China Medical University, Shenyang 110122)

Affiliation: [1] Library of China Medical University, Shenyang 110122

Source: Journal of Intelligence (《情报杂志》), 2018, No. 1, pp. 66-71, 52 (7 pages in total)

Abstract: [Purpose/Significance] As a special type of scientific knowledge network, the co-word network not only reveals, at the micro level, the relationships among entities within the system of scientific knowledge, but also reflects through its evolution how scientific concepts grow. However, most current research on co-word networks remains at the descriptive stage. This study applies link prediction to co-occurrence networks of MeSH terms in an attempt to find a new way of forecasting the direction of scientific development. [Method/Process] Co-word networks were constructed with MeSH heading/subheading combinations as nodes. Major heading/subheading pairs that had not co-occurred in the same published paper were extracted as the study sample, and feature values such as common neighbors, preferential attachment, and shortest path length were computed for each pair. Three classification algorithms (Naive Bayes, SMO, and the J48 decision tree) were used to predict whether a pair would co-occur, and attribute selection was used to rank the importance of each feature. [Result/Conclusion] Among the three algorithms, Naive Bayes performed best at predicting the positive class, yielding correctly predicted co-occurring word pairs. In the attribute ranking, the resource allocation index and the weighted form of the Adamic-Adar index played the more important roles. Link prediction thus offers a way to predict, to a certain extent, whether two terms will co-occur in the next time period, and a new knowledge discovery method for forecasting the future directions of scientific development.
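The pair-level features named in the abstract can be illustrated with a minimal sketch. It uses the standard unweighted definitions of common neighbors, Adamic-Adar, resource allocation, and preferential attachment, plus a breadth-first shortest path; the abstract reports that weighted forms mattered most, and those are not reproduced here because their exact definition is not given. The function names and the toy graph are hypothetical and are not taken from the paper's data.

```python
import math
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (term_a, term_b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path_length(adj, source, target):
    """Breadth-first search; returns None when target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj[node]:
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def pair_features(adj, x, y):
    """Unweighted topological features for a candidate term pair (x, y)."""
    common = adj[x] & adj[y]
    return {
        "common_neighbors": len(common),
        # A common neighbor of two distinct nodes has degree >= 2, so log > 0.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "resource_allocation": sum(1.0 / len(adj[z]) for z in common),
        "preferential_attachment": len(adj[x]) * len(adj[y]),
        "shortest_path": shortest_path_length(adj, x, y),
    }


# Invented toy co-word network; the edges are illustrative only.
toy_edges = [
    ("Neoplasms", "Apoptosis"),
    ("Neoplasms", "Mutation"),
    ("Apoptosis", "Signal Transduction"),
    ("Mutation", "Signal Transduction"),
    ("Mutation", "Genes, p53"),
]
toy_graph = build_graph(toy_edges)

# A pair that does not yet co-occur in the toy network.
features = pair_features(toy_graph, "Neoplasms", "Signal Transduction")
print(features)
```

In a setup like the one the abstract describes, each not-yet-co-occurring pair would contribute one such feature vector, labeled by whether the pair co-occurs in the following period, before being passed to a classifier.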

Keywords: link prediction; co-word network; machine learning; knowledge discovery

Classification code: G250 [Culture and Science: Library Science]

 
