一种基于关键域子空间的离群数据聚类算法被引量：8

An Algorithm for Clustering of Outliers Based on Key Attribute Subspace

出　　处：《计算机研究与发展》2007年第4期651-659,共9页Journal of Computer Research and Development

基　　金：国家自然科学基金项目(60403009);重庆市自然科学基金项目(2005BB2224)

摘　　要：离群数据发现与分析是数据挖掘的重要组成部分,现有离群数据挖掘算法主要针对如何检测离群对象,缺乏对挖掘出的离群数据集进行解释与分析的有效方法.通过对离群数据来源及特性进行分析并结合粗糙集理论,定义了离群划分相似度的概念,提出了一种基于关键属性域子空间的离群数据聚类算法COKAS,该算法不仅揭示了离群数据子空间特性,进一步获取了扩展知识,而且有助于对整体数据集的理解.对两个多维数据集的实验结果表明,该算法具有良好的适应性及有效性.It is an important part of data mining to discover and analyze outlying observations. Outliers may contain crucial information, and so detecting them is much more significant than detecting general patterns in some applications which include, for instance, credit card fraud in finance, calling fraud in telecommunication, intrusion in network, disease diagnosis, etc. Existing outlier mining algorithms focus on detecting and identifying outliers, but studies of outliers include both mining outliers and analyzing why they are exceptional. The research on explaining and analyzing outliers slightly lags behind outlier mining technology now. It is inevitable that analyzing outliers to the full needs a great deal of knowledge from object task fields. However, some further discoveries of outliers may be obtained from studies of distributing characteristics of dataset in attribute space. By analyzing the origin and feature of outliers and using the theory of rough set, a concept of outlying partition similarity is defined and then an algorithm for clustering outliers based on key attribute subspace （COKAS） is proposed. The approach can provide the extended knowledge of identified outliers and improve the understanding of the whole data set. Experimental results of real multi-dimension data set show that this algorithm is scalable and efficient.

关键词：离群集离群划分相似度关键域子空间聚类

分类号：TP311[自动化与计算机技术—计算机软件与理论]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

一种基于关键域子空间的离群数据聚类算法被引量：8

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

一种基于关键域子空间的离群数据聚类算法 被引量：8

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索

一种基于关键域子空间的离群数据聚类算法被引量：8