An Improved Distributed Lazy Associative Classification Algorithm


Authors: YANG Haomin (杨浩敏)[1], MA Chao (马超)[1], WU Haiyan (吴海燕)[1]

Affiliation: [1] College of Computer Science, Chongqing University, Chongqing 400044

Source: Computer and Modernization (《计算机与现代化》), 2015, No. 8, pp. 116-120 (5 pages)

Abstract: The distributed lazy associative classification (DLAC) algorithm is a lazy associative classification algorithm that uses a distributed association rule mining algorithm. The existing DLAC algorithm has two main problems: it is inefficient when classifying multiple test samples, and its projection operation is not distributed. To address both problems, this paper proposes an improved distributed lazy associative classification algorithm, PDLAC. First, the test samples are clustered with KMeans. Second, each cluster is checked against an aggregation condition: if the condition is met, the test samples in the cluster are aggregated; otherwise, each test sample in the cluster becomes a cluster of its own. Third, distributed projection is performed and association rules are mined with the C-DMA algorithm. Finally, a classifier is built to classify the one or more test samples in each cluster. In experiments with the degree of parallelism set to 15, PDLAC took far less time than DLAC, and the improvement grew as the number of test samples increased. The experimental results indicate that PDLAC is an effective solution to both problems.
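The benefit of aggregating test samples can be illustrated with a small single-process sketch. It is not the authors' implementation: the abstract does not specify the aggregation condition, so the group of test samples is simply given as input (the KMeans step is omitted), a brute-force rule miner stands in for C-DMA, and nothing here is distributed. All function names (`project`, `mine_rules`, `classify`, `classify_group`) and the toy data are hypothetical. The point it shows is that projection and rule mining run once per group rather than once per test sample.

```python
from collections import Counter, defaultdict
from itertools import combinations


def project(training, items):
    """Keep only the items of each training row that also occur in `items`."""
    projected = []
    for row_items, label in training:
        kept = frozenset(row_items) & items
        if kept:
            projected.append((kept, label))
    return projected


def mine_rules(projected, min_support=1, min_confidence=0.5, max_len=2):
    """Brute-force stand-in for the rule miner: antecedent -> label rules."""
    antecedent_counts = Counter()
    rule_counts = Counter()
    for kept, label in projected:
        for size in range(1, max_len + 1):
            for antecedent in combinations(sorted(kept), size):
                antecedent_counts[antecedent] += 1
                rule_counts[(antecedent, label)] += 1
    rules = []
    for (antecedent, label), count in rule_counts.items():
        confidence = count / antecedent_counts[antecedent]
        if count >= min_support and confidence >= min_confidence:
            rules.append((frozenset(antecedent), label, confidence))
    return rules


def classify(sample, rules):
    """Sum the confidence of every rule whose antecedent the sample contains."""
    scores = defaultdict(float)
    for antecedent, label, confidence in rules:
        if antecedent <= sample:
            scores[label] += confidence
    if not scores:
        return None  # no rule applies to this sample
    return max(sorted(scores), key=lambda lbl: scores[lbl])


def classify_group(training, group, **mining_options):
    """Project and mine once for the whole group, then classify each member."""
    union = frozenset().union(*group)  # items of all test samples in the group
    rules = mine_rules(project(training, union), **mining_options)
    return [classify(frozenset(sample), rules) for sample in group]


# Toy data (assumed for illustration only).
training = [
    ({"outlook=sunny", "wind=weak"}, "play"),
    ({"outlook=sunny", "wind=strong"}, "stay"),
    ({"outlook=rain", "wind=weak"}, "play"),
    ({"outlook=rain", "wind=strong"}, "stay"),
]
group = [{"outlook=sunny", "wind=weak"}, {"outlook=rain", "wind=strong"}]
predictions = classify_group(training, group)
```

Classifying each sample separately would call `project` and `mine_rules` once per sample; with aggregation, a group of similar samples shares one projected training set and one rule set, which is where the time saving described in the abstract would come from.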

Keywords: aggregation method; distributed projection; distributed association rule mining; lazy method; associative classification

Classification No.: TP312 [Automation and Computer Technology: Computer Software and Theory]

 
