Similarity Measurement of Research Interests in Semantic Network  (Cited by: 11)

Authors: Ba Zhichao (巴志超), Li Gang (李纲) [1], Zhu Shiwei (朱世伟) [2]

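Affiliations: [1] School of Information Management, Wuhan University, Wuhan 430072; [2] Information Research Institute, Shandong Academy of Sciences, Jinan 250014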

Source: New Technology of Library and Information Service (《现代图书情报技术》), 2016, No. 4, pp. 81-90 (10 pages)

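Funding: This work is one of the research outcomes of the National Natural Science Foundation of China project "Research on the Dynamic Evolution of Research Teams" (No. 71273196); the Shandong Province Key R&D Program project "Key Technologies and Applications of a Customizable Big Data Knowledge Service Platform" (No. 2015GGX101037); and the Shandong Academy of Sciences Youth Fund project "Key Technologies for Mining Scientific and Technical Documents Based on Ontology Annotation" (No. 2013QN036).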

Abstract: [Objective] To accurately identify relationships between authors whose research content is similar but who use different keywords, and to address the lack of semantic association in traditional co-occurrence analysis, this study proposes a method for measuring the similarity of authors' research interests based on a keyword semantic network. [Methods] The word2vec model, a neural network language model, is used to represent author keywords as word vectors, giving each keyword a low-dimensional, real-valued representation at the semantic level. The semantic relatedness between keywords is then computed to construct a keyword semantic network, and the Jensen-Shannon (JS) distance is used to measure the similarity of the resulting author research interest matrices. [Results] The method can compute the relatedness of both co-occurring and non-co-occurring keyword pairs and effectively uncovers potential collaborations between authors. [Limitations] The size and accuracy of the training corpus need further improvement, and the proposed measure considers potential collaboration only between two authors; collaboration among multiple authors needs further research. [Conclusions] The results offer a useful reference for improving measures of author collaboration based on traditional co-occurrence analysis.
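The pipeline described in the abstract (keyword vectors, semantic relatedness, JS distance between author interest profiles) can be illustrated with a minimal sketch. This is not the authors' implementation. The three-dimensional vectors below are hand-made stand-ins for word2vec output; the `threshold` parameter, the `interest_distribution` helper, and the choice to flatten each author's interests into a single probability distribution over the vocabulary are all assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (log base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, skipping zero-probability terms.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))

def interest_distribution(author_keywords, vocab, vectors, threshold=0.5):
    """Hypothetical helper: spread each keyword's count onto semantically
    related vocabulary terms (edges of the semantic network), then normalise."""
    weights = []
    for term in vocab:
        w = 0.0
        for kw, count in author_keywords.items():
            sim = cosine(vectors[kw], vectors[term])
            if sim >= threshold:  # keep only sufficiently related pairs
                w += count * sim
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights] if total else weights

# Toy vectors standing in for trained word2vec embeddings (assumed values).
vectors = {
    "text mining": [0.9, 0.1, 0.0],
    "data mining": [0.8, 0.2, 0.1],
    "bibliometrics": [0.1, 0.9, 0.2],
    "citation analysis": [0.0, 0.8, 0.3],
}
vocab = list(vectors)

# Three hypothetical authors with no keywords in common.
author_a = {"text mining": 3}
author_b = {"data mining": 2}
author_c = {"citation analysis": 4}

dist_a = interest_distribution(author_a, vocab, vectors)
dist_b = interest_distribution(author_b, vocab, vectors)
dist_c = interest_distribution(author_c, vocab, vectors)

d_ab = js_distance(dist_a, dist_b)  # semantically close keywords
d_ac = js_distance(dist_a, dist_c)  # semantically distant keywords
print(f"JS distance A-B: {d_ab:.4f}, A-C: {d_ac:.4f}")
```

Authors A and B share no keyword, so a pure co-occurrence count would treat them as unrelated, while the semantic profiles place them much closer to each other than to author C. In practice the vectors would come from a model trained on a domain corpus rather than from hand-written values.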

Keywords: network; neural network language model; semantic similarity; research interest matrix

Classification number: G353.1 [Cultural Science: Information Science]

 
