Institution: [1] School of Computer Science and Technology, Harbin Institute of Technology
Source: Journal of Harbin University of Commerce: Natural Sciences Edition (《哈尔滨商业大学学报(自然科学版)》), 2006, No. 1, pp. 84-87 (4 pages)
Abstract: Retrieving information on the Internet is hard, and search engines make it easier. The volume of data on the Internet is so large that retrieval techniques designed for conventional databases cannot meet the demand. To address this, Google applies techniques such as parallel processing, index barrels, data compression, and the PageRank algorithm, and builds a complex architecture of five parts: the web crawler, the Repository, the index system (including the indexer, the barrels, the file index, and so on), the Sorter, and the Searcher. Google's ranking system combines term frequency (count-weight), hit type (type-weight), proximity (prox-weight), and page importance. Most notable is the PageRank algorithm, which computes page importance by applying the citation theory of literature retrieval to the Web: a page is important if many pages point to it, or if some important pages point to it. PageRank greatly improves retrieval efficiency.
Keywords: search engine; PageRank; Google; web crawler; ranking
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
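
The PageRank idea summarized in the abstract (a page is important when many pages, or a few important pages, point to it) can be illustrated with a short power-iteration sketch. This is not the paper's or Google's implementation: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from pages without outgoing links are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults (assumptions).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Assumption: rank held by pages with no outgoing links
        # is spread evenly over all pages.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {
            page: (1.0 - damping) / n + damping * dangling / n
            for page in pages
        }
        # Each page passes an equal share of its rank to every page it links to.
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank

    return rank


# Hypothetical four-page web: A, B and D all point to C; C points back to A.
toy_links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, C ranks highest because three pages point to it, and A ranks above B and D because its single inbound link comes from the important page C, which matches the two conditions stated in the abstract.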