Hadoop Parallel Association Rules Based on Minimal Monotone Constraint Closure in a Big Data Environment  Cited by: 2

CMSC-HPAR: the closure minimal single constraint based Hadoop parallel association rules algorithm for large data environment

Authors: Li Chunqing (李春青)[1], Li Haisheng (李海生)[1], Liang Tingting (梁婷婷)[1], Zhao Kai (赵凯)[2]

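Affiliations: [1] Department of Mathematics and Computer Science, Guangxi Normal University for Nationalities, Chongzuo, Guangxi 532200; [2] Academic Affairs Office, Pingdingshan University, Pingdingshan, Henan 467000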

Source: China Sciencepaper (《中国科技论文》), 2015, No. 20, pp. 2356-2361 (6 pages)

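Funding: Guangxi Higher Education Institutions Science and Technology Research Project (YB2014417); Henan Province Science and Technology Plan Project (142102210225)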

Abstract: To address the high rule redundancy of traditional association rule algorithms, this paper proposes CMSC-HPAR, a Hadoop-parallelized association rule algorithm based on minimal monotone constraint closure. First, using the equivalence relation that the closure operator induces on the set of constrained rules, a minimal rule set satisfying the monotone constraint is given. This partitions the constrained rules into disjoint equivalence classes and reduces the proportion of redundant rules. Second, to handle big data, the MapReduce parallel computing model of the Hadoop framework is used to parallelize the computation of minimal monotone constraint closure association rules, which improves the scalability of the algorithm for large-scale data processing. Finally, experimental comparisons on standard test sets demonstrate the effectiveness of the proposed algorithm.
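
As a rough illustration of the two ideas named in the abstract (grouping itemsets into equivalence classes by a closure operator, and counting support in a map/reduce style), the following is a minimal Python sketch. It is not the CMSC-HPAR algorithm from the paper: the toy transactions, the function names, and the in-memory simulation of the map and reduce phases are all assumptions made for illustration, and no Hadoop API is used.

```python
from collections import defaultdict
from itertools import chain, combinations

# Toy transaction database (hypothetical data, not from the paper).
TRANSACTIONS = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "d"}),
    frozenset({"c", "d"}),
]


def closure(itemset, transactions):
    """Return the intersection of all transactions that contain `itemset`.

    Simplifying assumption: an itemset contained in no transaction
    is mapped to itself.
    """
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset.intersection(*covering)


def map_phase(partition, max_size):
    """Mapper: emit (itemset, 1) for each non-empty subset up to max_size."""
    for t in partition:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(t), k):
                yield frozenset(combo), 1


def reduce_phase(pairs):
    """Reducer: sum the counts emitted for each itemset."""
    counts = defaultdict(int)
    for itemset, n in pairs:
        counts[itemset] += n
    return dict(counts)


def equivalence_classes(itemsets, transactions):
    """Group itemsets that share the same closure into one class."""
    classes = defaultdict(list)
    for s in itemsets:
        classes[closure(s, transactions)].append(s)
    return dict(classes)


def run(transactions, max_size=2):
    """Simulate two data partitions, count supports, then group by closure."""
    mid = len(transactions) // 2
    partitions = [transactions[:mid], transactions[mid:]]
    pairs = chain.from_iterable(map_phase(p, max_size) for p in partitions)
    counts = reduce_phase(pairs)
    return counts, equivalence_classes(counts.keys(), transactions)
```

Calling `run(TRANSACTIONS)` returns the support counts and the closure-based classes. On this toy data, `{a}`, `{b}`, and `{a, b}` fall into the same class because each has closure `{a, b}`, so a rule generator could keep one representative per class instead of all three, which is the kind of redundancy reduction the abstract describes.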

Keywords: big data; closure operator; minimal monotone constraint; Hadoop framework; association rules; MapReduce parallel computing

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
