Feature Analysis and Automatic Identification of Query Specificity (cited by: 6)

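Affiliations: [1] School of Information Management, Wuhan University, Wuhan 430072; [2] Center for Studies of Information Resources, Wuhan University, Wuhan 430072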

Source: New Technology of Library and Information Service (《现代图书情报技术》), 2015, No. 2, pp. 15-23 (9 pages)

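Funding: This work is one of the research outcomes of the National Key Technology R&D Program project "Research on Ontology Construction, Storage, and Visualization Technology for Cultural Heritage Knowledge" (No. 2012BAH33F03) and the National Natural Science Foundation of China General Program project "Research on Modeling and Framework Implementation of Generic Entity Retrieval Based on Language Models" (No. 71173164).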

Abstract: [Objective] This paper builds a human-annotated collection from Sogou query logs, analyzes the features of query specificity, identifies specificity automatically, and evaluates the identification results. [Methods] Basic features and content features of user query strings are selected and analyzed statistically. Decision tree, SVM, and Naive Bayes classifiers are then each trained to identify query specificity automatically. [Results] Identification with these features is effective: under ten-fold cross-validation, the macro-averaged F-measure of every classifier is above 0.8. [Limitations] Users' clickthrough information is not considered in feature selection, and whether the conditional independence assumption of the Naive Bayes classifier can be ignored in this experiment still needs further verification. [Conclusions] The basic features and content features of query strings are, by themselves, sufficient to distinguish broad, medium, and specific queries effectively.
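The evaluation measure named in the abstract, the macro-averaged F-measure under ten-fold cross-validation, can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption for illustration only: the label names `broad`, `medium`, and `specific`, the toy predictions, the interleaved fold assignment, and the helper names `macro_f1` and `k_fold_indices`. None of the numbers come from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F-measure: the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k folds over n samples.

    Samples are assigned to folds in an interleaved fashion (index mod k),
    which is an assumption; the source does not describe the split.
    """
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


# Hypothetical labels and toy predictions, not data from the paper.
LABELS = ["broad", "medium", "specific"]
y_true = ["broad", "broad", "medium", "medium", "specific", "specific"]
y_pred = ["broad", "medium", "medium", "medium", "specific", "broad"]
score = macro_f1(y_true, y_pred, LABELS)
print(f"macro F1 = {score:.4f}")
```

Because the macro average weights every class equally, a classifier that does poorly on a rare specificity class is penalized as heavily as one that does poorly on a common class, which is presumably why it suits a three-class task like this one.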

Keywords: Query specificity; Decision tree; SVM; Naive Bayes

Classification number: G252.7 [Cultural Science: Library Science]
