Authors: GU Junhua, WU Junyan[2], XU Xinyun, XIE Zhijian, ZHANG Suqi
Affiliations: [1] School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China; [2] Hebei Province Key Laboratory of Big Data Computing (Hebei University of Technology), Tianjin 300401, China; [3] School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
Source: Journal of Computer Applications (《计算机应用》), 2018, No. 11, pp. 3069-3074 (6 pages)
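Funding: Hebei Province Science and Technology Plan Project (17210305D); Tianjin Science and Technology Plan Project (16ZXHLSF0023); Tianjin Science and Technology Plan Project (15ZXHLGX00130); Tianjin Natural Science Foundation Project (15JCQNJC00600)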
Abstract: To further improve the execution efficiency of the Frequent Pattern-Growth (FP-Growth) algorithm on the Spark platform, a new Spark-based parallel FP-Growth algorithm named BFPG (Better Frequent Pattern-Growth) is proposed. First, the F-List grouping strategy is improved by taking into account both the size of the Frequent Pattern-Tree (FP-Tree) and the amount of computation per partition, so that the total load of each partition is approximately equal. Second, the dataset partitioning strategy is optimized by creating a list, P-List, which reduces the number of traversals and lowers the time complexity. Experimental results show that BFPG improves the mining efficiency of parallel FP-Growth and that the algorithm scales well.
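The two ideas in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than Spark, and it is not the paper's implementation: the abstract does not give the load formula or the layout of P-List. The cost function `estimate_load` is a hypothetical placeholder, and the `item_to_group` lookup table only stands in for what P-List might enable, namely splitting each transaction in a single pass.

```python
import heapq
from collections import Counter


def build_f_list(transactions, min_support):
    """Return frequent items sorted by descending support (the F-List)."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [(item, c) for item, c in counts.items() if c >= min_support]
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in frequent]


def estimate_load(rank):
    """Hypothetical cost of mining the item at this F-List rank.

    Assumption: items deeper in the F-List have longer prefixes and so
    cost more. The paper's actual estimate is not given in the abstract.
    """
    return rank + 1


def balanced_groups(f_list, num_groups):
    """Greedily assign the heaviest remaining item to the lightest group."""
    heap = [(0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    item_to_group = {}
    loads = [0] * num_groups
    ranked = sorted(enumerate(f_list), key=lambda p: -estimate_load(p[0]))
    for rank, item in ranked:
        load, g = heapq.heappop(heap)
        item_to_group[item] = g
        load += estimate_load(rank)
        loads[g] = load
        heapq.heappush(heap, (load, g))
    return item_to_group, loads


def split_transaction(transaction, rank_of, item_to_group):
    """Emit one (group, prefix) pair per group in a single backward pass."""
    items = sorted((i for i in set(transaction) if i in rank_of),
                   key=rank_of.get)
    seen, out = set(), []
    for pos in range(len(items) - 1, -1, -1):
        g = item_to_group[items[pos]]  # constant-time lookup, no list scan
        if g not in seen:
            seen.add(g)
            out.append((g, items[:pos + 1]))
    return out


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"], ["b", "c"]]
f_list = build_f_list(transactions, min_support=2)
rank_of = {item: r for r, item in enumerate(f_list)}
item_to_group, loads = balanced_groups(f_list, num_groups=2)
shards = split_transaction(["c", "a", "b"], rank_of, item_to_group)
```

In a Spark job, `split_transaction` would typically be the body of a `flatMap`, with the emitted pairs then grouped by key so that each partition builds its own local FP-Tree. The greedy assignment is one common load-balancing heuristic and is used here only for illustration.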
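Keywords: big data platform; association rule; frequent itemset; Frequent Pattern-Growth (FP-Growth) algorithm; Spark
Classification number: TP181 [Automation and Computer Technology - Control Theory and Control Engineering]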