Parallel Web Mining System Based on Cloud Platform 被引量：1

Parallel Web Mining System Based on Cloud Platform

作　　者：Shengmei Luo Qing He Lixia Liu Xiang Ao Ning Li Fuzhen Zhuang

机构地区：[1]Pre-Research department of ZTE [2]Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences [3]Graduate University of Chinese Academy of Sciences

出　　处：《ZTE Communications》2012年第4期45-53,共9页中兴通讯技术（英文版）

基　　金：supported by the National Natural Science Foundation of China (No. 61175052,60975039, 61203297, 60933004, 61035003);National High-tech R&D Program of China (863 Program) (No.2012AA011003);supported by the ZTE research found of Parallel Web Mining project

摘　　要：Traditional machine-learning algorithms are struggling to handle the exceedingly large amount of data being generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms to be able to handle large-scale, high-dimensional text data. Cloud computing involves the delivery of computing and storage as a service to a heterogeneous community of recipients, Recently, it has aroused much interest in industry and academia. Most previous works on cloud platforms only focus on the parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes parallel web crawler; parallel text extract, transform and load （ETL） and modeling; and parallel text mining and application subsystems. The complete system enables variable real-world web-mining applications for mass data.Traditional machine-learning algorithms are struggling to handle the exceedingly large amount of data being generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms to be able to handle large-scale, high-dimensional text data. Cloud computing involves the delivery of computing and storage as a service to a heterogeneous community of recipients, Recently, it has aroused much interest in industry and academia. Most previous works on cloud platforms only focus on the parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes parallel web crawler; parallel text extract, transform and load （ETL） and modeling; and parallel text mining and application subsystems. The complete system enables variable real-world web-mining applications for mass data.

关键词：web mining large scale high volume high dimension cloudcomputing

分类号：TP393.09[自动化与计算机技术—计算机应用技术] TP391.1[自动化与计算机技术—计算机科学与技术]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

Parallel Web Mining System Based on Cloud Platform 被引量：1

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

Parallel Web Mining System Based on Cloud Platform 被引量：1

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索