网页排序中的随机模型及算法  被引量:2

Research on stochastic models and algorithms for web page ranking

在线阅读下载全文

作  者:刘玉婷[1] 马志明[2] 

机构地区:[1]北京交通大学理学院数学系,北京100044 [2]中国科学院数学与系统科学研究院

出  处:《中国科学:数学》2011年第12期1095-1103,共9页Scientia Sinica:Mathematica

基  金:国家自然科学基金(批准号:11001010)资助项目

摘  要:随着互联网规模的日益增长,搜索引擎已经成为互联网上有效的信息获取工具.而在众多搜索引擎的背后,是信息检索技术,也即网页排序算法在起作用.网页排序包括重要性排序和相关性排序.通过我们研究发现,尽管这两类排序所依据的准则不同,但是都可以通过建立适当的随机过程模型来研究.对于网页重要性排序,我们通过分析用户浏览网页的行为建立了Markov骨架过程的框架.基于该框架我们分析了三种不同的随机过程模型对用户行为模拟的合理程度,并设计了名为BrowseRank的一组新算法,该算法可以根据用户上网行为来计算网页的重要性.在网页相关性排序中,我们主要针对排序结果联合问题建立了一个基于Markov链的监督学习框架.通过将传统方法的监督化,使原来难于解决的问题变的易于学习,将原来的NP-难问题转化为一个半正定规划问题,提高了效率.As the World Wide Web grows rapidly, search engines have become the most popular tools to access the large volume of information from it. And the key factor of the search engine is the ranking model of web pages, which contains two types: static rank and dynamic rank. In past, different approaches have been designed to solve these two problems separately. In this thesis, we analyze them on the same base—stochastic process model, and design new algorithms to solve them effciently and effectively. Firstly, we establish a framework on Markov skeleton process to compute the page importance by investigating real browsing behaviors of users. Within this framework, we design a group of eight new novel algorithms all referred to as BrowseRank, to compute the page importance based on the continuous-time time-homogeneous Markov process, which is one of three special cases of the Markov skeleton process. And from the experimental results, we flnd BrowseRank outperforms other baseline algorithms, such as PageRank and TrustRank. Secondly, we build a supervised framework for rank aggregation based on Markov chain. Within this framework, we not only generalize some unsupervised algorithms to supervised ones, but also design a new approach referred to as Supervised MC2 for rank aggregation, which transform the original NP-hard problem to a semi-deffned programme.

关 键 词:信息检索 排序联合问题 MARKOV骨架过程 BrowseRank算法 

分 类 号:O211.6[理学—概率论与数理统计] TP393.092[理学—数学]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象