Application of the HD-Apriori Algorithm in Water Conservancy Census Data Processing


Authors: SHI Bo, WAN Ding-sheng [1], YAO Jian-guo, DING Wei [1]

Affiliations: [1] School of Computer and Information, Hohai University, Nanjing 210098, China; [2] Bureau of Hydrology, Huaihe River Commission, Bengbu 233001, Anhui Province, China

Source: Information Technology (《信息技术》), 2018, No. 6, pp. 7-11, 16 (6 pages)

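Funding: National Science and Technology Support Program project (2015BAB07B01); Special Scientific Research Fund for Public Welfare Industries (201501022)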

Abstract: Deploying the traditional Apriori association-rule algorithm on a cloud platform greatly improves data mining efficiency. However, as data types grow more complex and data volumes keep increasing, Apriori must scan the transaction database repeatedly, the number of frequent items it generates rises markedly, and large amounts of system I/O and memory are consumed. This paper proposes a parallel HD-Apriori algorithm based on the Map/Reduce programming model. The algorithm uses a 0-1 matrix to simplify computation and to reduce the number of database scans, and in this way parallelizes Apriori. Experiments show that HD-Apriori mines more efficiently than Apriori, and that its advantage becomes more pronounced as the data scale grows.

Keywords: data mining; Map/Reduce; Apriori algorithm; water conservancy census

Classification number: TP311.13 [Automation and Computer Technology - Computer Software and Theory]
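
The abstract describes two ideas: encoding transactions as a 0-1 matrix so that support counting no longer rescans the raw database, and splitting the counting work in Map/Reduce style. The record gives no further detail of HD-Apriori, so the following is only a minimal sketch of those two ideas in plain Python, not the authors' implementation. All function names, the partitioning scheme, and the sample items (`dam`, `reservoir`, `sluice`) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def to_binary_matrix(transactions, items):
    """Encode each transaction as a 0-1 row: 1 where the item is present."""
    return [[1 if item in t else 0 for item in items] for t in transactions]


def map_count(rows, candidates):
    """Map step (illustrative): count, within one partition, the rows
    that have a 1 in every column index of each candidate itemset."""
    local = Counter()
    for row in rows:
        for cand in candidates:
            if all(row[j] for j in cand):
                local[cand] += 1
    return local


def reduce_counts(partials):
    """Reduce step (illustrative): sum the per-partition counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def frequent_itemsets(transactions, min_count, n_parts=2):
    """Level-wise Apriori over a 0-1 matrix split into n_parts partitions."""
    items = sorted({i for t in transactions for i in t})
    # The raw transactions are read once here; later levels use the matrix.
    matrix = to_binary_matrix(transactions, items)
    parts = [matrix[k::n_parts] for k in range(n_parts)]

    result = {}
    candidates = [(j,) for j in range(len(items))]
    while candidates:
        counts = reduce_counts(map_count(p, candidates) for p in parts)
        frequent = sorted(c for c in candidates if counts[c] >= min_count)
        for c in frequent:
            result[tuple(items[j] for j in c)] = counts[c]

        # Apriori join: merge frequent k-itemsets into (k+1)-candidates,
        # then prune any candidate that has an infrequent k-subset.
        k = len(frequent[0]) + 1 if frequent else 0
        freq_set = set(frequent)
        joined = {
            tuple(sorted(set(a) | set(b)))
            for a in frequent
            for b in frequent
            if len(set(a) | set(b)) == k
        }
        candidates = sorted(
            c for c in joined
            if all(s in freq_set for s in combinations(c, k - 1))
        )
    return result


# Hypothetical sample data, not taken from the census.
transactions = [
    {"dam", "reservoir", "sluice"},
    {"dam", "reservoir"},
    {"dam", "sluice"},
    {"reservoir", "sluice"},
    {"dam", "reservoir", "sluice"},
]
result = frequent_itemsets(transactions, min_count=3)
```

In a real Map/Reduce deployment each partition would be processed on a separate worker and the reduce step would run over shuffled key-value pairs; here both steps run in one process purely to show the data flow.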

 
