Source: Computer and Modernization (《计算机与现代化》), 2015, No. 8, pp. 116-120 (5 pages)
Abstract: The distributed lazy associative classification (DLAC) algorithm is a lazy associative classification algorithm that uses distributed association rule mining. The existing DLAC algorithm has two main problems: it is inefficient when classifying multiple test samples, and its projection operation is not distributed. To address both problems, this paper proposes an improved distributed lazy associative classification algorithm, PDLAC. First, the test samples are clustered with KMeans. Second, the samples in each cluster are checked against an aggregation condition: if they satisfy it, they are aggregated; otherwise each sample in the cluster becomes a cluster of its own. Third, distributed projection is performed and association rules are mined with the C-DMA algorithm. Finally, a classifier is built to classify the one or more test samples in each cluster at the same time. In experiments run with a degree of parallelism of 15, PDLAC took far less time than DLAC, and the improvement grew as the number of test samples increased. The results indicate that PDLAC addresses both problems effectively.
Keywords: aggregation method; distributed projection; distributed association rule mining; lazy method; associative classification
Classification number: TP312 [Automation and Computer Technology: Computer Software and Theory]
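The aggregation and projection steps described in the abstract can be illustrated with a small single-process sketch. Everything specific here is an assumption made for illustration: the abstract does not state the aggregation condition, so a pairwise Jaccard-similarity threshold stands in for it, and the function names `jaccard`, `group_cluster`, and `project` are hypothetical. Clustering (KMeans), the distributed execution, and C-DMA rule mining are not shown; the sketch takes one cluster of test samples as given and only shows how aggregated samples could share a single projection of the training data.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two item sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_cluster(cluster, threshold=0.5):
    """Split one cluster of test samples into the groups processed together.

    ASSUMPTION: the aggregation condition used here (every pair of samples
    has Jaccard similarity >= threshold) is a stand-in; the abstract does
    not say what condition PDLAC actually uses.
    """
    if all(jaccard(a, b) >= threshold for a, b in combinations(cluster, 2)):
        return [list(cluster)]        # condition met: aggregate the cluster
    return [[s] for s in cluster]     # otherwise each sample stands alone


def project(training, group):
    """Project the training data onto the items of a group of test samples.

    Each training instance keeps only the items that occur in at least one
    sample of the group; instances left with no items are dropped.
    """
    items = frozenset().union(*group)
    projected = []
    for features, label in training:
        kept = features & items
        if kept:
            projected.append((kept, label))
    return projected


# Toy data: each training instance is (item set, class label).
training = [
    (frozenset({"a", "b", "x"}), "pos"),
    (frozenset({"c", "y"}), "neg"),
    (frozenset({"z"}), "neg"),
]
cluster = [frozenset({"a", "b"}), frozenset({"a", "b", "c"})]

for group in group_cluster(cluster):
    print(len(group), "sample(s) share", len(project(training, group)), "projected instances")
```

In this sketch an aggregated group triggers one projection (and, in a full pipeline, one round of rule mining) instead of one per test sample, which is the source of the efficiency gain the abstract attributes to aggregation.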