A Stable Parallel Distributed Frequent Itemset Mining Algorithm and Its Application

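Affiliations: [1] College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang 310027; [2] Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700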

Source: Computer Applications and Software, 2011, No. 3, pp. 83-85, 124 (4 pages)

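Funding: National High Technology Research and Development Program of China (2006AA01A123); National Science Fund for Distinguished Young Scholars (NSFC60525202)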

Abstract: To address frequent itemset mining in large-scale medical data analysis, this paper proposes P-FIM, a stable and scalable parallel distributed algorithm. P-FIM divides the mining task into mutually independent, isomorphic subtasks so that they can be computed in parallel, and it exploits the Map/Reduce framework and the cluster environment to improve its robustness and load balancing. Performance experiments on traditional Chinese medicine (TCM) prescription data with up to 5.12 million records show that the algorithm runs stably on a distributed cluster and that its speedup is close to linear as the cluster grows. A TCM data correlation analysis scheme designed and implemented on top of P-FIM can effectively extract comprehensive and reliable information on the correlations among diseases, symptoms and medicines from large-scale clinical data.
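The abstract does not describe P-FIM's internals, so the following is only a minimal sketch of the general idea it names: splitting frequent itemset counting into independent, identically shaped subtasks and merging their results in a Map/Reduce style. It is not the authors' algorithm. The function names, the `max_size` and `num_partitions` parameters, and the toy prescription data are all assumptions made for illustration, and the partitions are processed sequentially here rather than on a cluster.

```python
from collections import Counter
from itertools import combinations


def map_partition(transactions, max_size):
    """Map step: count every itemset of up to max_size items in one partition."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates removed
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def reduce_counts(partial_counts):
    """Reduce step: sum the per-partition counts of each itemset."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)  # Counter.update adds counts together
    return total


def frequent_itemsets(transactions, min_support, max_size=2, num_partitions=2):
    """Return itemsets whose absolute support is at least min_support."""
    # Each partition is an independent subtask with the same structure,
    # so the map calls below could run in parallel on separate workers.
    partitions = [transactions[i::num_partitions] for i in range(num_partitions)]
    partials = [map_partition(part, max_size) for part in partitions]
    total = reduce_counts(partials)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Hypothetical toy prescriptions; the herb names are illustrative only.
prescriptions = [
    ["ginseng", "licorice", "ginger"],
    ["ginseng", "licorice"],
    ["licorice", "ginger"],
    ["ginseng", "licorice", "jujube"],
]
result = frequent_itemsets(prescriptions, min_support=3)
print(result)
```

Because each partition is counted without reference to the others, the result does not depend on how the records are split, which is the property that makes this kind of decomposition suitable for a cluster. Enumerating all itemsets up to a fixed size is exponential in `max_size`, so a practical algorithm would prune candidates instead.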

Keywords: data mining; frequent itemset mining; Map/Reduce; parallel framework; medical data analysis

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
