Authors: 石波 (SHI Bo); 万定生 (WAN Ding-sheng) [1]; 姚建国 (YAO Jian-guo); 丁伟 (DING Wei) [1]
Affiliations: [1] School of Computer and Information, Hohai University, Nanjing 210098, China; [2] Bureau of Hydrology, Huaihe River Commission, Bengbu 233001, Anhui Province, China
Source: Information Technology (《信息技术》), 2018, No. 6, pp. 7-11, 16 (6 pages)
Funding: National Key Technology R&D Program project (2015BAB07B01); Special Scientific Research Fund for Public Welfare Industries (201501022)
Abstract: Deploying the traditional Apriori association-rule algorithm on a cloud platform greatly improves data mining efficiency. However, as data types grow more complex and data volumes keep increasing, Apriori must scan the transaction database frequently and generates markedly more frequent itemsets, consuming large amounts of system I/O and memory. This paper proposes a parallel HD-Apriori algorithm based on the Map/Reduce programming model. The algorithm uses a 0-1 matrix to simplify computation and reduce the number of database scans, thereby parallelizing Apriori. Experiments show that HD-Apriori mines more efficiently than Apriori, and that the advantage grows with the size of the data.
Keywords: data mining; Map/Reduce; Apriori algorithm; water resources census
Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
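The abstract's central idea, encoding transactions as a 0-1 matrix so that support counting works on the matrix instead of rescanning the database, can be illustrated with a minimal sketch. This is not the paper's HD-Apriori implementation, whose details the record does not give. It is a plain-Python illustration in which the function names (`build_matrix`, `map_phase`, `reduce_phase`, `frequent_itemsets`), the row partitioning scheme, and the in-process simulation of the map and reduce steps are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def build_matrix(transactions):
    """Encode transactions as a 0-1 matrix: one row per transaction, one column per item."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


def map_phase(rows, candidates):
    """Hypothetical mapper: count, within one partition, the rows that contain each candidate."""
    counts = Counter()
    for row in rows:
        for cand in candidates:
            if all(row[col] for col in cand):
                counts[cand] += 1
    return counts


def reduce_phase(partial_counts, min_support):
    """Hypothetical reducer: sum partial counts and keep candidates that meet min_support."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {cand: n for cand, n in total.items() if n >= min_support}


def frequent_itemsets(transactions, min_support, n_partitions=2):
    # The raw transactions are read once; later passes use only the matrix.
    items, matrix = build_matrix(transactions)
    # Assumed partitioning: deal rows out round-robin, as if to separate mappers.
    partitions = [matrix[i::n_partitions] for i in range(n_partitions)]
    candidates = [(col,) for col in range(len(items))]
    result = {}
    k = 1
    while candidates:
        partials = [map_phase(part, candidates) for part in partitions]
        frequent = reduce_phase(partials, min_support)
        for cand, n in frequent.items():
            result[tuple(items[col] for col in cand)] = n
        k += 1
        # Apriori pruning: keep a k-candidate only if all its (k-1)-subsets are frequent.
        cols = sorted({col for cand in frequent for col in cand})
        candidates = [
            cand for cand in combinations(cols, k)
            if all(sub in frequent for sub in combinations(cand, k - 1))
        ]
    return result
```

Calling `frequent_itemsets([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}], min_support=2)` returns a dictionary that maps each frequent itemset (as a sorted tuple of item names) to its support count. On a real cluster the map and reduce steps would run as distributed tasks rather than as list comprehensions in one process.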