Data stream clustering algorithm based on dependent function (cited by: 5)

Source: Journal of Computer Applications (《计算机应用》), 2013, No. 1, pp. 202-206 (5 pages)

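Funding: Gansu Province Science and Technology Support Program (090GKCA075); 2012 Ministry of Education Humanities and Social Sciences Research Project (12YJCZH282)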

Abstract: Traditional data stream clustering algorithms are mostly based on distance or density, and both their clustering quality and processing efficiency are low. To address these problems, a data stream clustering algorithm based on the dependent function is proposed. First, data points are modeled as matter-elements, and the dependent function needed to solve the problem is established. Second, the value of the dependent function is calculated, and its magnitude is used to judge the degree to which a data point belongs to a given cluster. Then, the proposed method is applied within the online-offline framework of data stream clustering. Finally, the algorithm is tested on the real data set KDD-CUP99 and on randomly generated artificial data sets. The experimental results show that the clustering purity of the proposed method is above 92% and that it can process about 6,300 records per second. Compared with traditional algorithms, processing efficiency is greatly improved, scalability with respect to dimensionality and number of clusters is strong, and the method is suitable for processing large-scale dynamic data sets.
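The abstract does not give the dependent function itself, so the following is only a minimal sketch of how a dependent-function membership test might look. It assumes one commonly cited form of the elementary dependent function from extenics, built on a classical domain and a joint domain per attribute. The names `extension_distance`, `dependent_value`, `Cluster`, and `assign`, the equal weighting of attributes, and the fallback for a zero denominator are all assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def extension_distance(x: float, interval: Interval) -> float:
    """Extension distance from x to [a, b]: negative inside, positive outside."""
    a, b = interval
    return abs(x - (a + b) / 2.0) - (b - a) / 2.0


def dependent_value(x: float, classical: Interval, joint: Interval) -> float:
    """Assumed elementary dependent function K(x).

    K >= 0      : x lies in the classical domain
    -1 < K < 0  : x lies in the joint domain but outside the classical domain
    K < -1      : x lies outside the joint domain
    """
    rho0 = extension_distance(x, classical)
    if rho0 <= 0:
        width = classical[1] - classical[0]
        return -rho0 / width if width > 0 else 0.0
    denom = extension_distance(x, joint) - rho0
    if denom == 0:
        # Assumption for this sketch: fallback when the two domains share
        # the endpoint nearest to x, which would otherwise divide by zero.
        return -rho0 - 1.0
    return rho0 / denom


@dataclass
class Cluster:
    classical: List[Interval]  # one classical-domain interval per attribute
    joint: List[Interval]      # one joint-domain interval per attribute


def membership(point: Sequence[float], cluster: Cluster) -> float:
    """Equal-weight mean of per-attribute dependent values (an assumption)."""
    values = [
        dependent_value(x, c, j)
        for x, c, j in zip(point, cluster.classical, cluster.joint)
    ]
    return sum(values) / len(values)


def assign(point: Sequence[float], clusters: Sequence[Cluster]) -> Tuple[int, float]:
    """Return the index of the cluster with the largest membership degree."""
    degrees = [membership(point, c) for c in clusters]
    best = max(range(len(clusters)), key=lambda i: degrees[i])
    return best, degrees[best]


clusters = [
    Cluster(classical=[(0, 10), (0, 10)], joint=[(-5, 15), (-5, 15)]),
    Cluster(classical=[(20, 30), (20, 30)], joint=[(15, 35), (15, 35)]),
]
index_a, degree_a = assign((5, 5), clusters)
index_b, degree_b = assign((25, 22), clusters)
```

In an online-offline setting, a function like `assign` would presumably run in the online phase for each arriving record, with a record whose best degree falls below some threshold starting a new cluster. That threshold and the update rule for the intervals are not specified in the abstract and are left out of this sketch.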

Keywords: data stream clustering; matter-element; dependent function; classical domain; joint domain

Classification: TP311.5 [Automation and Computer Technology: Computer Software and Theory]
