Crawler Algorithm Based on Directory Tree in Network Science and Technology Resource

Cited by: 3

Source: Computer Engineering (《计算机工程》), 2009, No. 1, pp. 277-279, 282 (4 pages)

Funding: Supported by the National Science and Technology Infrastructure Platform Construction Fund (2005DKA63904)

Abstract: Online science and technology resources are classified in many different ways and come in large volumes. To address these characteristics, this paper proposes a crawling algorithm based on a directory tree. The algorithm uses ontology knowledge supplied by a domain ontology knowledge base as the evaluation criterion for extracting and recognizing valid directory links, then applies an improved link-analysis strategy to obtain valid node links and crawl them. The work examines the crawler architecture and focuses on optimizing the speed at which the newest resources are acquired. Experimental results show that the algorithm effectively improves the resource collection rate.
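The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: walk a site's directory tree, follow a directory link only when its anchor text matches a domain vocabulary, and collect the resource links found under accepted directories. Everything in it is an assumption made for illustration: the names `ONTOLOGY_TERMS`, `relevance`, `crawl_directory_tree`, and `SITE`, the word-overlap score, the 0.5 threshold, and the in-memory dictionary that stands in for fetched and parsed pages. It is not the paper's scoring function or its improved link-analysis strategy.

```python
from collections import deque

# Hypothetical domain vocabulary; the paper's ontology knowledge base is not shown.
ONTOLOGY_TERMS = {"physics", "chemistry", "materials", "energy"}


def relevance(anchor_text, terms=ONTOLOGY_TERMS):
    """Return the fraction of words in the anchor text that are domain terms."""
    words = anchor_text.lower().split()
    if not words:
        return 0.0
    return sum(w in terms for w in words) / len(words)


def crawl_directory_tree(pages, root, threshold=0.5):
    """Breadth-first walk over directory links judged relevant.

    pages maps a URL to a list of (anchor_text, href, is_directory) tuples,
    standing in for pages that a real crawler would fetch and parse.
    Returns resource URLs in the order they were collected.
    """
    collected = []
    seen = {root}
    queue = deque([root])
    while queue:
        url = queue.popleft()
        for text, href, is_directory in pages.get(url, []):
            if href in seen:
                continue
            if is_directory:
                # Only descend into directories whose label matches the vocabulary.
                if relevance(text) >= threshold:
                    seen.add(href)
                    queue.append(href)
            else:
                # Resource links under an accepted directory are collected as-is.
                seen.add(href)
                collected.append(href)
    return collected


# Toy site used only to exercise the sketch.
SITE = {
    "/": [("Physics", "/physics/", True), ("Sports news", "/sports/", True)],
    "/physics/": [
        ("Paper A", "/physics/a.html", False),
        ("Energy materials", "/physics/energy/", True),
    ],
    "/physics/energy/": [("Paper B", "/physics/energy/b.html", False)],
    "/sports/": [("Match report", "/sports/m.html", False)],
}

if __name__ == "__main__":
    print(crawl_directory_tree(SITE, "/"))
```

On the toy site, the "Sports news" directory scores zero against the vocabulary and is never entered, so nothing beneath it is collected. Pruning whole subtrees at the directory level is what keeps a crawler of this kind from spending requests on irrelevant branches.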

Keywords: science and technology resources; information collection; directory tree; ontology

Classification code: TP301 [Automation and Computer Technology: Computer System Architecture]
