Outlier Detection Algorithm Based on Dispersion of Neighbors (基于邻域离散度的异常点检测算法)

Cited by: 21

Authors: SHEN Yanhui; LIU Huawen [1]; XU Xiaodan [1]; ZHAO Jianmin [1]; CHEN Zhongyu [1]

Affiliation: [1] College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua, Zhejiang 321004, China

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2016, No. 12, pp. 1763-1772 (10 pages)

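Funding: National Natural Science Foundation of China (Nos. 61272007, 61272468, 61572443); Natural Science Foundation of Zhejiang Province (No. LY14F020012); Project of the Education Department of Zhejiang Province (No. Y201328291)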

Abstract: Outlier detection plays an important role in machine learning and data mining. A major limitation of existing outlier detection algorithms is that normal data points near the border of the data receive a relatively high outlier degree, which in some situations causes them to be misjudged as outliers. To address this problem, this paper proposes a novel outlier detection algorithm based on the dispersion of neighbors. The algorithm takes the dispersion of the neighborhood in which a data point lies as that point's outlier degree, so that border points do not receive excessively high scores while normal points and outliers can still be well distinguished. Experimental results show that the algorithm detects outliers effectively, is insensitive to parameter selection, and is relatively stable in performance.

Keywords: outlier detection; machine learning; data mining; principal component analysis

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
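
The abstract describes scoring each point by the dispersion of its neighborhood but does not give the formula. The following is only a minimal sketch of that general idea, not the authors' algorithm. It assumes that the neighborhood is the point plus its k nearest neighbors under Euclidean distance, and that dispersion is the total variance of that neighborhood (the trace of its covariance matrix, which equals the sum of its PCA eigenvalues). The function name `neighborhood_dispersion_scores` and the parameter `k` are hypothetical.

```python
import numpy as np


def neighborhood_dispersion_scores(X, k=5):
    """Illustrative outlier score: total variance of each point's neighborhood.

    Assumption (not taken from the paper): the neighborhood of a point is the
    point itself plus its k nearest neighbors, and "dispersion" is the sum of
    per-feature variances over that neighborhood.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diffs, diffs)

    # Indices of the k + 1 closest points per row; the point itself is among
    # them because its distance to itself is zero.
    idx = np.argsort(d2, axis=1)[:, : k + 1]

    scores = np.empty(n)
    for i in range(n):
        neighborhood = X[idx[i]]
        # Sum of per-feature variances = trace of the covariance matrix.
        scores[i] = neighborhood.var(axis=0).sum()
    return scores
```

Under these assumptions, a point far from a dense cluster gets a large score because its neighborhood has to stretch across the gap to reach the cluster, whereas points inside the cluster have compact neighborhoods. Whether this particular dispersion measure reproduces the border-point behavior claimed in the abstract is not something the sketch establishes.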
