Design of a Specific Web Information-Collecting System Based on Mobile Crawler

Cited by: 3

Source: Computer Engineering and Applications (《计算机工程与应用》), 2003, Issue 36, pp. 153-156 (4 pages)

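Funding: National Natural Science Foundation of China (Grant No. 60073030); Ministry of Education key project "Research on Key Technologies for Modern Distance Education"; Fujitsu research project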

Abstract: Search engines have become important tools for Web navigation. To provide powerful search facilities, search engines maintain comprehensive indices of the documents available on the Web. These indices are created and maintained by Web crawlers, which recursively traverse and download Web pages on behalf of search engines. Only after the pages have been downloaded does the search engine analyze and index them and make them available for retrieval. This paper presents an alternative, more efficient approach to building Web indices, based on mobile crawlers. The proposed crawlers are first transferred to the site where the data resides, and any unwanted data is filtered out there before the results are sent back to the search engine. The approach is particularly well suited to implementing so-called "smart" crawling algorithms, which determine an efficient crawling path based on the contents of the Web pages visited so far. The mobile crawler combines two technology trends, mobile computing and specialized search engines, and promises to solve the problems faced by current general-purpose search engines.
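The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Page` type, the `crawl_at_source` function, the keyword-based relevance test, and the in-memory `SITE` dictionary are all hypothetical, and the migration of the crawler to the remote host is not simulated. The sketch only shows a crawler that runs next to the data, follows links from relevant pages only (one simple form of "smart" crawling), and ships back just the pages it kept.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Page:
    """Hypothetical stand-in for a Web page stored at the data source."""
    url: str
    text: str
    links: Tuple[str, ...] = ()


def crawl_at_source(site: Dict[str, Page], start: str,
                    is_relevant: Callable[[Page], bool]) -> List[Page]:
    """Traverse pages on the host that stores them; keep only relevant ones."""
    seen = {start}
    queue = deque([start])
    kept: List[Page] = []
    while queue:
        page = site.get(queue.popleft())
        if page is None:
            continue  # dangling link: nothing to visit
        if not is_relevant(page):
            continue  # filter locally and do not follow this page's links
        kept.append(page)
        for link in page.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return kept


def payload_bytes(pages: Iterable[Page]) -> int:
    """Size of the page text that would have to cross the network."""
    return sum(len(p.text.encode("utf-8")) for p in pages)


# Hypothetical site contents, used only for illustration.
SITE = {
    "/": Page("/", "distance education portal", ("/courses", "/ads")),
    "/courses": Page("/courses", "distance education course list", ("/ads",)),
    "/ads": Page("/ads", "banner advertising " * 50, ("/more-ads",)),
    "/more-ads": Page("/more-ads", "more advertising " * 50),
}


def about_education(page: Page) -> bool:
    """Assumed relevance test: a plain keyword match."""
    return "education" in page.text


shipped = crawl_at_source(SITE, "/", about_education)
print("pages sent back:", [p.url for p in shipped])
print("bytes sent back:", payload_bytes(shipped),
      "of", payload_bytes(SITE.values()), "stored at the source")
```

A conventional crawler would download all four pages before analyzing them. In this sketch, only the two pages that pass the filter cross the network, and `/more-ads` is never even visited because the page linking to it was rejected.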

Keywords: Internet; search engine; Web; information-collecting system; design; mobile crawler

Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
