Authors: Zheng Tingting (郑婷婷); Chen Chong (陈翀); Bai Haiyan (白海燕) [2]; Liang Bing (梁冰) [2]
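Affiliations: [1] School of Government Management, Beijing Normal University, Beijing 100875; [2] Institute of Scientific and Technical Information of China, Beijing 100038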
Source: Journal of Intelligence (《情报杂志》), 2019, No. 11, pp. 175-180 (6 pages)
Abstract: [Purpose/Significance] In literature retrieval, an account may be used exclusively by one person or shared by several. To understand users' information needs and keep personalized services accurate, the divergent behaviors produced by the group behind a shared account must first be excluded, since they interfere with the interpretation of those needs. It is therefore necessary to identify the behavioral boundary of a user, that is, whether the visitor behind an account is an individual or a group. [Method/Process] Features of two kinds, behavioral habits and topic preferences, are extracted from the log data of academic users. An individual-user identification model is then built on academic-user small data and random forest classification, and an empirical study is carried out on the website of the National Science and Technology Digital Library. [Results/Conclusions] Experiments show that the proposed method effectively identifies individual users in academic search logs, with an accuracy of about 92.9%. Topic consistency is the most important feature for distinguishing individual from group academic users. The study can help identify individual and institutional users and optimize user management, and it also offers ideas for determining whether accesses from different devices belong to the same user.
Keywords: academic users; academic search logs; small data; individual user identification; random forest classification
Classification: G252.7 [Culture Science - Library Science]; TP391 [Automation and Computer Technology - Computer Application Technology]