Research Progress in Microblog Retrieval (微博检索的研究进展)  Cited by: 2

A Survey of Microblog Search


Authors: Wei Bingjie (卫冰洁)[1,2], Wang Bin (王斌)[3], Zhang Shuai (张帅)[1], Li Peng (李鹏)[3]

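Affiliations: [1] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [2] National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029; [3] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093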

Source: Journal of Chinese Information Processing (《中文信息学报》), 2015, No. 2, pp. 10-23 (14 pages)

Funding: Science and Technology Support Program (2012BAH46B02)

Abstract: With the rapid development of microblogs, microblog retrieval has become one of the hot research areas in recent years. This paper first analyzes microblog documents and microblog queries based on the TREC Microblog dataset and identifies two ways in which microblog retrieval differs from traditional text retrieval. First, microblog documents have many characteristics of their own compared with web pages. Second, microblog queries are time-sensitive: ranking must take temporal factors into account in addition to the semantic similarity of the text, and methods that do so are collectively referred to as time-aware retrieval techniques. Because of these two differences, existing information retrieval techniques cannot be applied directly to microblog search. The paper then surveys recent work on both aspects. It describes the various features of microblogs and the retrieval methods built on them; then, following the traditional information retrieval process, it introduces ranking models that use temporal information for text representation, as a document prior, and for query expansion. Finally, it summarizes existing work and discusses directions for future research.
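As an illustration of what "using temporal information as a document prior" can mean, the following is a minimal Python sketch, not the method of any particular paper covered by the survey. It assumes an exponential recency prior and adds its logarithm to a text-based log score; the names `Post`, `log_recency_prior`, `time_aware_score`, `rank`, and the decay parameter `rate` are all hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    text_log_score: float  # log P(query | post) from any text retrieval model
    age_days: float        # time elapsed between the post and the query, in days


def log_recency_prior(age_days: float, rate: float = 0.1) -> float:
    """Log of an assumed exponential prior: rate * exp(-rate * age_days)."""
    return math.log(rate) - rate * age_days


def time_aware_score(post: Post, rate: float = 0.1) -> float:
    """Combine text similarity and the temporal prior in log space."""
    return post.text_log_score + log_recency_prior(post.age_days, rate)


def rank(posts, rate: float = 0.1):
    """Return posts sorted from highest to lowest time-aware score."""
    return sorted(posts, key=lambda p: time_aware_score(p, rate), reverse=True)


# A month-old post with a slightly better text match, and a one-day-old post.
old_post = Post(text_log_score=-5.0, age_days=30.0)
new_post = Post(text_log_score=-5.5, age_days=1.0)
ranked = rank([old_post, new_post])
```

With the assumed `rate` of 0.1 the newer post ranks first despite its weaker text match; as `rate` approaches zero the prior flattens and the ordering falls back to text similarity alone.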

Keywords: microblog retrieval; temporal information; microblog characteristics; text representation; document prior; query expansion

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
