Affiliation: [1] School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan, Hubei 430073
Source: Computer Engineering & Science (《计算机工程与科学》), 2013, No. 1, pp. 82-87 (6 pages)
Abstract: Traditional search engines cannot perform real-time monitoring on behalf of users. To solve this problem, this paper proposes a directed search-and-monitoring technique. Users customize tasks according to their own needs, including the search scope (the websites to search) and the search topic. The system monitors at a user-defined period and proactively returns the results to the user in a timely manner. Google App Engine (GAE), Google's cloud platform, is used as the development platform, and several of its cloud services are used to address scheduled task management, multi-task triggering, and high concurrency. The general-purpose web crawler is rewritten, and a directed web crawler model is proposed through algorithmic improvements. Combining the directed crawler with the cloud's powerful servers greatly shortens crawling time and improves search-and-monitoring efficiency. The combination of a cloud platform with search-and-monitoring technology is a successful experiment in the Platform as a Service (PaaS) approach.
Keywords: Google cloud platform; directed; search; monitoring; scheduled task management; directed web crawler
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]