Affiliation: [1] Graduate University of Chinese Academy of Sciences
Source: Electronic Technology (Shanghai) (《电子技术(上海)》), 2012, No. 7, pp. 1-4 (4 pages)
Abstract: Building on an analysis of association rules and the principle of the Apriori algorithm, this paper addresses a known drawback: when Apriori scans a database containing a very large number of transactions, the system's I/O load and CPU computation pressure become excessive. It proposes an improved algorithm aimed at raising Apriori's performance on large data volumes. The main idea is to use sampling and transaction compression to reduce the number of transactions the algorithm must scan, thereby improving its efficiency. The improved algorithm is implemented on Weka, a mainstream open-source data mining tool. Experimental results demonstrate the algorithm's effectiveness.
Keywords: data mining; association rules; Apriori; transaction compression; sampling; Weka
Classification code: TP311.13 [Automation and Computer Technology - Computer Software and Theory]
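
The abstract names two ways of reducing the number of transactions Apriori has to scan: sampling and transaction compression. The record does not give the paper's actual procedure, so the following is only a minimal Python sketch of the general idea, not the authors' Weka implementation. The function name `apriori_compressed`, the parameters `sample_fraction` and `seed`, and the specific compression rule (drop a transaction once it contains no candidate k-itemset, since it can then contain no frequent (k+1)-itemset) are all assumptions made for illustration.

```python
import random
from itertools import combinations


def apriori_compressed(transactions, min_support, sample_fraction=1.0, seed=0):
    """Return {itemset: count} for frequent itemsets in the (optionally sampled) data.

    Hypothetical sketch: sampling and compression rules are assumptions,
    not the procedure from the paper.
    """
    data = [frozenset(t) for t in transactions]

    # Sampling (assumed form): mine a simple random sample instead of all data.
    if sample_fraction < 1.0:
        size = max(1, int(len(data) * sample_fraction))
        data = random.Random(seed).sample(data, size)
    min_count = min_support * len(data)

    # Pass 1: count single items.
    counts = {}
    for t in data:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join frequent (k-1)-itemsets, then prune candidates that have
        # an infrequent (k-1)-subset (the Apriori property).
        prev = set(frequent)
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        if not candidates:
            break

        # Count candidates and compress: keep only transactions that matched
        # at least one candidate, so later passes scan fewer transactions.
        counts = dict.fromkeys(candidates, 0)
        kept = []
        for t in data:
            if len(t) < k:
                continue  # too short to contain any k-itemset; dropped
            matched = False
            for c in candidates:
                if c <= t:
                    counts[c] += 1
                    matched = True
            if matched:
                kept.append(t)
        data = kept

        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result
```

With `sample_fraction=1.0` the sketch behaves like plain Apriori with compression only. With a smaller fraction the supports are estimates from the sample, so itemsets near the threshold may be missed or wrongly reported; a practical version would typically lower the threshold on the sample and verify the result against the full data.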