Global outlier detection based on hierarchical clustering  Cited by: 4


Authors: Liang Binmei (梁斌梅) [1,2], Wei Linna (韦琳娜) [1], Song Qingzhen (宋庆祯) [1]

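Affiliations: [1] School of Mathematics and Information Science, Guangxi University, Nanning 530004; [2] College of Computer Science, Sichuan University, Chengdu 610065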

Source: Application Research of Computers (《计算机应用研究》), 2011, No. 5, pp. 1731-1733 (3 pages)

Funding: Guangxi University Scientific Research Fund project (XJZ100258)

Abstract: Existing outlier detection algorithms still fall short in generality, effectiveness, user-friendliness, and performance on large, high-dimensional data sets. This paper proposes a fast and effective global outlier detection approach based on hierarchical clustering. Agglomerative hierarchical clustering is performed first; the degree of isolation of the data is then judged visually, and the number of outliers determined, from the clustering tree and the distance matrix. The outliers are then removed in an unsupervised manner, working from the top of the clustering tree downward. Simulation experiments show that the approach identifies global outliers quickly and effectively, is user-friendly, handles data sets of various shapes, and is suitable for outlier detection on large, high-dimensional data sets.
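The abstract outlines the pipeline only at a high level, so the following is a minimal Python sketch of the general idea, not the authors' implementation. It assumes single-linkage agglomerative clustering, Euclidean distance, and a caller-supplied outlier count `n_outliers` (the paper determines this count visually from the clustering tree and distance matrix). The function names and the `max_size` parameter are hypothetical.

```python
import math
from itertools import combinations


def agglomerate(points):
    """Naive single-linkage agglomerative clustering (assumed linkage).

    Returns the merge history as (left, right, distance) tuples, where
    left and right are tuples of point indices. Cubic time or worse:
    suitable for illustration only.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(math.dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


def top_down_outliers(merges, n_outliers, max_size=1):
    """Walk the tree from the root down, flagging small branches.

    max_size is a hypothetical knob: the largest branch size that is
    still treated as isolated rather than as a genuine cluster.
    """
    outliers = []
    for left, right, _ in reversed(merges):  # last merge is the root
        small = min(left, right, key=len)
        if len(small) <= max_size and not set(small) & set(outliers):
            outliers.extend(small)
        if len(outliers) >= n_outliers:
            break
    return sorted(outliers[:n_outliers])


def find_outliers(points, n_outliers, max_size=1):
    return top_down_outliers(agglomerate(points), n_outliers, max_size)


if __name__ == "__main__":
    # Five tightly grouped points plus two distant ones (indices 5 and 6).
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (-8, 7)]
    print(find_outliers(data, n_outliers=2))
```

Points that join the tree last, at the largest merge distances, are the most isolated, which is why the walk starts from the root. If `n_outliers` is set higher than the number of genuinely isolated points, the walk continues into ordinary low-level merges and flags regular points, so the count should be chosen from the merge distances rather than guessed.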

Keywords: outlier detection; hierarchical clustering; data mining; global outlier

Classification No.: TP311 [Automation and Computer Technology - Computer Software and Theory]

 
