Research on a Theme Crawler Based on a BP Neural Network


Authors: HUANG Li-bin (黄利斌); CHEN Hui (陈慧) (School of Information Science and Technology, Hunan Agricultural University, Changsha 410128, China)

Affiliation: [1] Hunan Radio and TV University (湖南广播电视大学), Changsha, Hunan 410004

Source: Computer Knowledge and Technology (《电脑知识与技术》), 2019, No. 2, pp. 160-162 (3 pages)

Abstract: Theme crawlers have become an important means of information collection. In traditional theme crawler technology, the topic words and their relevance weights are fixed, so crawl precision falls and the error rate rises as the number of crawled pages grows. The theme crawler technology adopted in this paper uses a BP neural network to dynamically update the topic words and their relevance weights according to the features of the downloaded web pages, so that crawl precision rises and the error rate falls as more pages are crawled. Theme crawler technology based on a BP neural network can improve the efficiency of information collection and reduce the losses caused by collection errors.
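The abstract does not specify the network architecture, the page features, or the update rule. The following is only a minimal sketch of the general idea, under stated assumptions: binary presence of each topic word as the input feature, one hidden layer of sigmoid units, squared-error loss, and plain online backpropagation. The vocabulary `TOPIC_WORDS`, the labelled `PAGES`, and the helpers `TinyBPNet`, `train_relevance_net` and `word_relevance` are all hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical topic vocabulary and labelled pages (1.0 = on-topic, 0.0 = off-topic).
TOPIC_WORDS = ["crawler", "neural", "network", "football"]
PAGES = [
    ("focused crawler downloads pages", 1.0),
    ("a neural network learns weights", 1.0),
    ("crawler guided by a neural network", 1.0),
    ("football match results", 0.0),
    ("football league news and weather", 0.0),
    ("weather forecast for tomorrow", 0.0),
]


def features(text):
    """Assumed feature: binary presence of each topic word in the page text."""
    tokens = set(text.lower().split())
    return [1.0 if word in tokens else 0.0 for word in TOPIC_WORDS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPNet:
    """One hidden layer of sigmoid units, trained by online backpropagation."""

    def __init__(self, n_in, n_hidden, lr=0.5, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        # w1[j][i]: input i -> hidden unit j; the last column is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        # w2[j]: hidden unit j -> output; the last entry is the bias.
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb.append(1.0)
        y = sigmoid(sum(w * v for w, v in zip(self.w2, hb)))
        return xb, hb, y

    def predict(self, x):
        return self._forward(x)[2]

    def train_step(self, x, target):
        xb, hb, y = self._forward(x)
        # Output delta for squared error with a sigmoid output unit.
        d_out = (y - target) * y * (1.0 - y)
        # Hidden deltas, computed before the output weights are changed.
        d_hid = [d_out * self.w2[j] * hb[j] * (1.0 - hb[j])
                 for j in range(len(self.w1))]
        for j in range(len(self.w2)):
            self.w2[j] -= self.lr * d_out * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= self.lr * d_hid[j] * xb[i]


def train_relevance_net(epochs=2000):
    net = TinyBPNet(n_in=len(TOPIC_WORDS), n_hidden=4)
    for _ in range(epochs):
        for text, label in PAGES:
            net.train_step(features(text), label)
    return net


def word_relevance(net):
    """Illustrative per-word weight: score change when only that word is present."""
    n = len(TOPIC_WORDS)
    base = net.predict([0.0] * n)
    return {word: net.predict([1.0 if k == i else 0.0 for k in range(n)]) - base
            for i, word in enumerate(TOPIC_WORDS)}


if __name__ == "__main__":
    net = train_relevance_net()
    print(net.predict(features("a crawler with a neural network")))
    print(word_relevance(net))
```

In a crawler loop, calling `train_step` on each newly judged page and recomputing `word_relevance` would be one way to let the per-word weights change as more pages are downloaded, in contrast to a fixed weight table. Whether the paper derives its weights this way is not stated in the abstract.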

Keywords: theme crawler; BP neural network; information collection; topic word list

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
