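A 2k-nearest-neighbor clustering algorithm for uncertain data streams based on interval numbers (cited by: 8)

Published English title: The clustering algorithm of uncertain data stream 2k-near neighbors based on interval number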

Authors: 陆亿红 (LU Yihong) [1]; 任胜亮 (REN Shengliang) (College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China)

Affiliation: [1] College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, Zhejiang, China

Source: Journal of Zhejiang University of Technology (《浙江工业大学学报》), 2018, No. 3, pp. 321-326 (6 pages)

Funding: Ministry of Water Resources special research fund for public-welfare industries (201401044)

Abstract: Most existing data stream clustering algorithms target deterministic data, but clustering of uncertain data streams is gradually receiving attention. Classical clustering algorithms for uncertain data have several shortcomings: the probability density function is difficult to obtain, practicality is poor, and computation is complex. To address these, this paper proposes UIDStream, an uncertain data stream clustering algorithm based on interval numbers. The algorithm represents attribute-uncertain data with interval numbers and, using a distance calculation method based on interval numbers, defines the similarity between uncertain data objects. Building on the idea of traditional k-nearest-neighbor clustering, it introduces the concepts of the interval-number-based 2k-nearest-neighbor micro-cluster and the optimal 2k-nearest-neighbor micro-cluster. Clustering of the uncertain data stream is achieved by merging the optimal 2k-nearest-neighbor micro-clusters. Experimental results show that the improved algorithm clusters well and improves both the clustering quality and the speed of uncertain data stream clustering.
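To make the interval-number idea concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not give the distance formula or the micro-cluster definitions, so the distance below (Euclidean over the lower and upper bounds of each attribute) and the helper names `interval_distance` and `nearest` are assumptions chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# An uncertain attribute value is modelled as a closed interval (low, high).
Interval = Tuple[float, float]


def interval_distance(x: Sequence[Interval], y: Sequence[Interval]) -> float:
    """Assumed distance: Euclidean over lower and upper bounds of each attribute.

    This formula is a common textbook choice, not necessarily the one
    defined in the paper.
    """
    total = 0.0
    for (x_lo, x_hi), (y_lo, y_hi) in zip(x, y):
        total += (x_lo - y_lo) ** 2 + (x_hi - y_hi) ** 2
    return math.sqrt(total)


def nearest(points: Sequence[Sequence[Interval]],
            query: Sequence[Interval],
            n: int) -> List[int]:
    """Return the indices of the n points closest to `query` (hypothetical helper)."""
    order = sorted(range(len(points)),
                   key=lambda i: interval_distance(points[i], query))
    return order[:n]
```

A caller exploring the 2k-neighbor idea could pass `n = 2 * k`; how the paper actually builds and merges its micro-clusters from such neighborhoods is not stated in the abstract.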

Keywords: uncertain data; interval number; data stream; clustering; data mining

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]
