A Multi-Agent Intelligent Crawler System Based on Market Matching

System of Intelligent Crawler Based on Multi-Agent

Authors: Liu Jia (刘佳)[1], Du Yajun (杜亚军)[1]

Affiliation: [1] School of Computer and Software Engineering, Xihua University, Chengdu, Sichuan 610039, China

Source: Journal of Xihua University: Natural Science Edition (《西华大学学报(自然科学版)》), 2016, No. 1, pp. 67-72 (6 pages)

Funding: National Natural Science Foundation of China (61271413; 61472329)

Abstract: As the volume of text, images, video, and audio on the Web keeps growing, the results returned by web crawlers have deteriorated, chiefly in the form of low precision, low recall, and a high repetition rate among crawled pages. To address these problems, this paper combines the basic principles of market matching with the characteristics of web crawlers and proposes a multi-Agent intelligent crawler system based on a market-matching algorithm. The key components of the system are analyzed and designed on the basis of that algorithm, and the precision, recall, and repetition rate of the crawled pages are then evaluated using 12 topics from the first-level Yahoo directory as test data. The experimental results show that, compared with a system that does not use the market-matching algorithm, the proposed system raises precision by 9% and recall by 8% and lowers the repetition rate by 5%, a considerable improvement in crawler performance.
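
The abstract reports precision, recall, and repetition rate but does not define them. The following is a minimal sketch of how such figures might be computed for a single crawl, assuming the usual information-retrieval definitions (precision and recall over unique pages, repetition rate as the share of fetches that re-download an already fetched page). The function name `crawl_metrics`, its inputs, and the sample URLs are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Set


def crawl_metrics(crawled: Iterable[str], relevant: Set[str]) -> Dict[str, float]:
    """Compute precision, recall, and repetition rate for one crawl.

    crawled  -- URLs in the order they were fetched (may contain duplicates)
    relevant -- assumed ground-truth set of on-topic URLs
    """
    fetched = list(crawled)
    unique = set(fetched)
    hits = unique & relevant  # on-topic pages that were actually fetched

    # Guard each ratio against an empty denominator.
    precision = len(hits) / len(unique) if unique else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    repetition = (len(fetched) - len(unique)) / len(fetched) if fetched else 0.0
    return {"precision": precision, "recall": recall, "repetition": repetition}


if __name__ == "__main__":
    # Hypothetical crawl: page "a" was fetched twice, page "d" was missed.
    print(crawl_metrics(["a", "b", "a", "c"], {"a", "c", "d"}))
```

Under these assumed definitions, a 5% lower repetition rate would mean fewer redundant fetches per page downloaded, which is the kind of saving the abstract attributes to the market-matching step.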

Keywords: market-matching algorithm; multi-Agent; intelligent crawler

Classification No.: TP393.09 [Automation and Computer Technology: Computer Application Technology]

 
