DMGrid: A Data Mining System Based on Grid Computing (in English)  Cited by: 2

Authors: Wang Yi (王翼)[1], Xu Liutong (徐六通)[1], Yang Shengqi (杨胜琦)[1]

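Affiliation: [1] Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China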

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2010, No. 2, pp. 180-190 (11 pages)

Funding: The National Natural Science Foundation of China under Grant No. 60402011; the National Eleventh Five-Year Science and Technology Support Program of China under Grant No. 2006BAH03B05

Abstract: Data mining tasks must process large-scale datasets, so they often take a long time to run. Grid computing aims to integrate distributed, heterogeneous, and idle computers into a single high-performance computing system, which makes it possible to use the computing capability of a grid to reduce data processing time. This paper proposes and implements DMGrid, a data mining system based on grid computing. DMGrid treats efficient parallel computing as a central concern and also supports dynamic configuration of grid computing resources. Unlike many existing data mining grids, DMGrid provides an engine that executes the workflow (algorithm flow) specified in an application, and it also offers monitoring of application execution. Finally, two applications, customer churn analysis and customer value analysis, are designed and run in experiments to validate the feasibility of DMGrid.

Keywords: grid computing; data mining; dynamic configuration; workflow; execution monitoring

Classification number: TP393.09 [Automation and Computer Technology: Computer Application Technology]

 
