Authors: Wei Bingjie (卫冰洁) [1,2], Wang Bin (王斌) [3], Zhang Shuai (张帅) [1], Li Peng (李鹏) [3]
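Affiliations: [1] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [2] National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029; [3] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093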
Source: Journal of Chinese Information Processing (《中文信息学报》), 2015, No. 2, pp. 10-23 (14 pages)
Funding: Science and Technology Support Program (2012BAH46B02)
Abstract: With the rapid development of microblogs, microblog retrieval has become one of the hot research areas in recent years. Based on the TREC Microblog dataset, this paper first analyzes microblog documents and microblog queries, and finds that microblog retrieval differs from traditional text retrieval in two ways. First, microblog documents have many characteristics of their own compared with web pages. Second, microblog queries are time-sensitive: ranking must take time into account in addition to the semantic similarity of the text, and methods of this kind are collectively called time-aware retrieval techniques. Because of these two differences, existing information retrieval techniques cannot meet the needs of microblog search. The paper then surveys recent work on both aspects. It first describes the various features of microblogs and the retrieval methods built on them. Then, following the traditional information retrieval process, it introduces ranking models that use temporal information in three ways: text representation, document priors, and query expansion. Finally, it summarizes existing work and discusses directions for future research.
Keywords: microblog retrieval; temporal information; microblog characteristics; text representation; document prior; query expansion
Classification number: TP391 [Automation and Computer Technology / Computer Application Technology]