Search Result Diversification Combining Semantic and Temporal Intent  Cited by: 7

Authors: Ren Pengjie[1], Chen Zhumin[1], Ma Jun[1], Sui Xueqin[1], Wu Kai[1]

Affiliation: [1] School of Computer Science and Technology, Shandong University, Jinan 250101

Source: Chinese Journal of Computers, 2015, No. 10, pp. 2076-2091 (16 pages)

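Funding: National Natural Science Foundation of China (61272240; 61103151; 61173068); Doctoral Program Fund of the Ministry of Education (20110131110028); Natural Science Foundation of Shandong Province (ZR2012FM037); Shandong Province Research Award Fund for Outstanding Young and Middle-aged Scientists (BS2012DX017)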

Abstract: Search result diversification is an effective way to improve user satisfaction and has become an active research topic in Web and database search, text summarization, and recommender systems. Most existing work considers only semantic diversification strategies. However, diversification is a complex optimization problem in which other factors, such as freshness, quality, and value, may also need to be considered. Moreover, the Web is a dynamic information space and users' query needs evolve over time, so many queries can be answered satisfactorily only under a specific temporal pattern. This paper proposes a multidimensional diversification method that combines the semantic and temporal dimensions to generate diversified search results. First, it gives a general formal definition of the multidimensional diversification framework. Second, for a given query, it studies how to compute the probability distribution of temporal intents from document, word, and query frequency data. Third, it presents a new evaluation measure designed for temporal diversification. Finally, it constructs a real-world dataset for the multidimensional diversification problem. Experiments show that the proposed method significantly outperforms mainstream baseline approaches on both traditional diversity measures and the temporal diversity measure proposed in the paper.

Keywords: multidimensional diversification; temporal intent; subtopic; semantics; time; social networks; social computing

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
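
Illustrative sketch: the abstract describes estimating a temporal-intent distribution from document, word, and query frequency data and then diversifying over both semantic and temporal intents, but this record does not give the paper's formulas. The Python below is therefore only a minimal sketch under stated assumptions, not the authors' method: the temporal distribution is assumed to be a linear mixture of three normalized histograms, and the re-ranking step is a generic xQuAD-style greedy selection. All function names, parameters (`weights`, `alpha`, `lam`), and data layouts are hypothetical.

```python
from collections import Counter


def temporal_intent_distribution(doc_times, word_times, query_times,
                                 weights=(1 / 3, 1 / 3, 1 / 3)):
    """Mix three normalized time histograms into P(time bucket | query).

    Each argument is a list of time buckets (e.g. years), one entry per
    observed document, word occurrence, or query submission. The linear
    mixture and its weights are assumptions made for this sketch.
    """
    sources = [Counter(doc_times), Counter(word_times), Counter(query_times)]
    buckets = set().union(*sources)
    mixed = {}
    for bucket in buckets:
        mixed[bucket] = sum(
            w * counts[bucket] / sum(counts.values())
            for w, counts in zip(weights, sources)
            if counts  # skip a source with no observations
        )
    total = sum(mixed.values())
    return {b: v / total for b, v in mixed.items()} if total > 0 else {}


def merge_dimensions(semantic, temporal, alpha=0.5):
    """Join two intent distributions into one; alpha weights the semantic side."""
    merged = {("semantic", s): alpha * p for s, p in semantic.items()}
    merged.update({("temporal", t): (1 - alpha) * p for t, p in temporal.items()})
    return merged


def greedy_diversify(relevance, coverage, intent_probs, k, lam=0.5):
    """xQuAD-style greedy re-ranking over a merged intent distribution.

    relevance:    doc id -> relevance score in [0, 1]
    coverage:     doc id -> {intent: P(doc satisfies intent)}
    intent_probs: intent -> P(intent | query)
    """
    selected = []
    # Probability that each intent is still unsatisfied by the selected docs.
    unsatisfied = {intent: 1.0 for intent in intent_probs}
    candidates = set(relevance)

    def score(doc):
        diversity = sum(
            p * coverage[doc].get(intent, 0.0) * unsatisfied[intent]
            for intent, p in intent_probs.items()
        )
        return (1 - lam) * relevance[doc] + lam * diversity

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic (doc ids must be sortable).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        for intent in intent_probs:
            unsatisfied[intent] *= 1.0 - coverage[best].get(intent, 0.0)
    return selected
```

In this sketch, a document that covers only intents already satisfied by earlier picks loses its diversity bonus, so a less relevant document covering a different subtopic or time period can be ranked above it. Setting `alpha` to 1 reduces the merged distribution to purely semantic diversification.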

 
