A Method of Search Term Recommendation Based on Dependency Syntactic Network Combined with PageRank (Cited by: 3)

Authors: Lou Wen [1,2,3]; Ma Xinyu; Su Zilong

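Affiliations: [1] Department of Information Management, School of Economics and Management, East China Normal University, Shanghai 200062; [2] Institute for Academic Evaluation and Development, East China Normal University, Shanghai 200241; [3] Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University), Ministry of Education, Shanghai 200062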

Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2023, No. 11, pp. 1358-1368 (11 pages)

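Funding: Shanghai Philosophy and Social Science Fund, Youth Project "Research on the Alienation of Formal Scientific Terminology Based on Large-Scale Academic Data" (2021ETQ002).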

Abstract: As information overload deepens and interdisciplinary research grows, improving the filtering ability of information retrieval systems has become an important issue for research on effective search term recommendation services. This study proposes a search term recommendation method for knowledge bases that integrates the PageRank algorithm with a dependency syntactic network: a search term set and a dependency syntactic network are constructed, and the search terms are then ranked with PageRank to produce recommendations. The method is validated on the abstracts of 124,516 publications in the information science & library science (LIS) field from Web of Science. Ten master's students in library and information science took part in a user study, and the method was compared with existing similar methods and systems. The results show a recommendation accuracy of 80%, an average within-list Cosine similarity of 0.530, and an average within-list Jaccard similarity of 0.395. The recommended terms show better diversity and serendipity than those of comparable systems, which indicates that the method broadens the coverage of users' information needs by the recommended search terms. The method offers a new reference approach and perspective for presenting information retrieval results; it can be applied directly to back-end vocabulary organization in information retrieval and indirectly to knowledge discovery and interdisciplinary research.
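
The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the edge list, the edge direction (dependent to head), the damping factor of 0.85, and the helper names `pagerank` and `top_terms` are all assumptions made for illustration. In practice the edges would come from a dependency parser run over the abstracts.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank over a directed edge list (hypothetical helper)."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


def top_terms(edges, k):
    """Return the k highest-ranked terms with their scores (hypothetical helper)."""
    ranks = pagerank(edges)
    return sorted(ranks.items(), key=lambda item: item[1], reverse=True)[:k]


# Toy dependency edges (dependent -> head); invented for illustration only.
toy_edges = [
    ("information", "retrieval"),
    ("query", "retrieval"),
    ("retrieval", "system"),
    ("search", "term"),
    ("term", "recommendation"),
    ("recommendation", "system"),
]

for term, score in top_terms(toy_edges, 3):
    print(f"{term}\t{score:.4f}")
```

In this toy graph, terms that many other terms depend on accumulate the most rank, which is the intuition behind using PageRank to order candidate search terms.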

Keywords: dependency parsing; language network; search term recommendation; information retrieval; complex network

Classification code: TP391.3 [Automation and Computer Technology: Computer Application Technology]

 
