Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2014, No. 7, pp. 1591-1595 (5 pages)
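Funding: National Natural Science Foundation of China (61103112); National Social Science Foundation of China (11CTQ036); Beijing Philosophy and Social Science Planning Fund (13SHC031); State Language Commission 12th Five-Year Plan Research Fund (YB125-10)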
Abstract: Query classification requires a taxonomy of knowledge about query intent. The knowledge needed for each query category is relatively large, so there is no guarantee that every query category is covered. This paper proposes a random-walk method for mining query classification knowledge from Wikipedia. Wikipedia concepts serve as the intent representation space, and each intent domain is represented as a set of Wikipedia articles and categories. All Wikipedia entries and category knowledge are first extracted into a set; a random walk then traverses every concept node in the resulting graph to obtain a probability distribution for each node, which is converted into classification weights to build the query knowledge link graph. Using Wikipedia as external knowledge addresses the data sparseness problem, and the random walk allows similarity to be computed for queries that are not directly linked, which improves the coverage of query classification. Experiments confirm that the method can effectively locate the domain of a user's query.
CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]
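The random-walk step summarized in the abstract can be illustrated with a small sketch. The abstract does not specify the walk variant, so this example assumes a random walk with restart computed by power iteration; the function name `random_walk_with_restart`, the restart probability, the iteration count, and the toy graph are all hypothetical and are not taken from the paper. The resulting visit probabilities stand in for the "classification weights" of each concept with respect to one intent category.

```python
from collections import defaultdict


def random_walk_with_restart(edges, seeds, restart=0.15, iterations=50):
    """Approximate visit probabilities of a random walk that restarts at `seeds`.

    `edges` is a list of undirected (node, node) pairs and `seeds` is a set of
    nodes that appear in `edges`. Both are illustrative inputs (assumption).
    """
    # Build an undirected adjacency list from the article/category links.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    # The walker restarts uniformly over the seed nodes of one intent category.
    restart_dist = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(restart_dist)

    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed node...
        nxt = {n: restart * restart_dist[n] for n in nodes}
        # ...otherwise it moves to a uniformly chosen neighbor.
        for n in nodes:
            share = (1.0 - restart) * prob[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        prob = nxt
    return prob


# Toy graph (hypothetical): two disconnected topic areas.
edges = [
    ("Python (programming language)", "Category:Programming languages"),
    ("Java (programming language)", "Category:Programming languages"),
    ("Category:Programming languages", "Category:Computing"),
    ("Jaguar", "Category:Felines"),
    ("Category:Felines", "Category:Animals"),
]
weights = random_walk_with_restart(edges, seeds={"Category:Computing"})
for node, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{weight:.3f}  {node}")
```

In this toy graph the article "Java (programming language)" is not directly linked to "Category:Computing", yet it receives a nonzero weight through the intermediate category, while the unrelated "Jaguar" node receives none. This is the kind of indirect connection the abstract credits with improving coverage.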