Authors: Zhang Xiaojuan [1]; Guo Jiarun [1]; Yang Shihan [1]; Gui Sisi [2]
Affiliations: [1] School of Public Administration, Sichuan University, Chengdu 610065 [2] College of Information Management, Nanjing Agricultural University, Nanjing 210095
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2025, No. 4, pp. 482-494 (13 pages)
Funding: National Social Science Fund of China, General Project "Research on Time-Aware Personalized Citation Recommendation for Academic Literature" (21BTQ072).
Abstract: In academic search systems, using an academic user's historical search behavior to predict how much literature the user will need in the next time period, and when, helps improve user satisfaction with literature recommendations. This study mines behavioral sequence features of academic users to improve the accuracy of predicting their download behavior, namely the number of downloads in the next download session and the time interval until the next download session. First, the study transforms the problem of predicting user download behavior into a time series prediction problem. Second, it extracts features from three perspectives (query reformulation behavior, query expressions, and download behavior) and uses a long short-term memory (LSTM) model to model each user's historical sessions as a time series in order to predict download behavior. Finally, it compares the predictive performance of the proposed features with that of features from existing research, and examines the predictive performance of different feature sets and of individual features. The proposed features improve the accuracy of the prediction tasks. When users are clustered and a separate LSTM model is trained on each cluster, the resulting models show the best overall predictive performance. Among all feature sets, the query expression features perform best for predicting the number of downloads in the next download session, and the download behavior features perform best for predicting the time interval until the next download session. Because few public log datasets on academic user search behavior are available, the study was validated experimentally on a single dataset, and the limited user behavior data in that dataset's logs constrains the feature engineering approach used here.
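The first step in the abstract, recasting download prediction as a time series problem, can be illustrated with a minimal sketch. It is not the authors' implementation: the `Session` fields, the two features, the fixed `window` length, and the choice to measure the time gap between session start times are all assumptions made for illustration. Each sample pairs a window of past sessions with the two targets named in the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Session:
    # All fields are hypothetical; the record does not list the real features.
    start_time: float   # session start, in hours from an arbitrary origin
    num_queries: int    # stand-in for a query-related feature
    num_downloads: int  # downloads made in this session


Sample = Tuple[List[List[float]], Tuple[int, float]]


def build_samples(sessions: List[Session], window: int) -> List[Sample]:
    """Turn one user's chronologically ordered sessions into
    (history window, targets) pairs for sequence prediction.

    Targets are (downloads in the next session,
                 time gap from the last history session to the next session).
    """
    samples: List[Sample] = []
    for i in range(window, len(sessions)):
        history = sessions[i - window:i]
        nxt = sessions[i]
        # One feature vector per past session, oldest first.
        features = [[float(s.num_queries), float(s.num_downloads)] for s in history]
        # Assumption: the gap is measured between session start times.
        gap = nxt.start_time - history[-1].start_time
        samples.append((features, (nxt.num_downloads, gap)))
    return samples
```

Each feature window could then be fed to a sequence model such as an LSTM, with the two targets handled as separate regression outputs; that modeling step is outside this sketch.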
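Keywords: academic users; literature download behavior prediction; log sessions; academic search; feature mining
Classification number: TP391.3 [Automation and Computer Technology: Computer Application Technology]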