An Outlier Detection Algorithm Based on Minimum Hyper Sphere Density

Authors: FENG Yu; YUAN Yi-wei

Affiliations: [1] School of Electronics and Control Engineering, Chang'an University, Xi'an 710064, Shaanxi, China [2] School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, Shaanxi, China

Source: Computer Technology and Development (《计算机技术与发展》), 2019, No. 6, pp. 32-36 (5 pages)

Funding: Fundamental Research Funds for the Central Universities (300102328103); Natural Science Basic Research Plan of Shaanxi Province (2017JQ6075)

Abstract: This paper defines the concept of minimum hyper sphere density (MHSD) and proposes an outlier detection algorithm based on it. The algorithm derives the effective neighbors of each datum from its k-nearest neighbors and reverse k-nearest neighbors, then uses the minimum hyper sphere density and the effective neighbors to compute the density deviation degree of each datum, and from that its isolation degree. Data whose isolation degree exceeds a specified threshold are regarded as outliers. The algorithm is evaluated on one two-dimensional artificial data set and two high-dimensional real data sets, and its results are compared with those of three classical algorithms: local outlier factor (LOF), influenced outlierness (INFLO), and density similarity neighbor based outlier factor (DSNOF). The experimental results show that MHSD detects the outliers in the data accurately and outperforms the three classical algorithms.
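The pipeline outlined in the abstract (k-nearest neighbors, reverse k-nearest neighbors, effective neighbors, a density, a deviation score, a threshold) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so every definition below is an assumption made for illustration only: effective neighbors are taken as the union of the k-nearest and reverse k-nearest neighbors, the density is a simple stand-in (k divided by the d-th power of the distance to the k-th nearest neighbor), and the score is the mean density of the effective neighbors divided by the point's own density. The function names `knn_and_distances`, `reverse_knn`, and `toy_outlier_scores` are hypothetical and do not come from the paper.

```python
import numpy as np


def knn_and_distances(X, k):
    """Return (knn, dist): knn[i] holds the indices of the k nearest
    neighbors of point i, dist is the full pairwise distance matrix."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    knn = np.argsort(dist, axis=1)[:, :k]
    return knn, dist


def reverse_knn(knn):
    """rknn[j] is the set of points that list j among their k nearest neighbors."""
    rknn = [set() for _ in range(len(knn))]
    for i, row in enumerate(knn):
        for j in row:
            rknn[int(j)].add(i)
    return rknn


def toy_outlier_scores(X, k):
    """Illustrative density-deviation score; NOT the paper's MHSD formula."""
    n, d = X.shape
    knn, dist = knn_and_distances(X, k)
    rknn = reverse_knn(knn)
    # ASSUMPTION: effective neighbors = kNN union reverse kNN.
    effective = [set(int(j) for j in knn[i]) | rknn[i] for i in range(n)]
    # ASSUMPTION: stand-in density = k / r^d, with r the distance from the
    # point to its k-th nearest neighbor.
    r = dist[np.arange(n), knn[:, -1]]
    density = k / np.maximum(r, 1e-12) ** d
    # ASSUMPTION: score = mean density of effective neighbors / own density.
    return np.array(
        [np.mean([density[j] for j in effective[i]]) / density[i] for i in range(n)]
    )


# Toy data: a 5 x 5 grid of points plus one point far from the grid.
grid = np.array([[x, y] for x in range(5) for y in range(5)], dtype=float)
X = np.vstack([grid, [[20.0, 20.0]]])
scores = toy_outlier_scores(X, k=3)
outliers = np.flatnonzero(scores > 10.0)  # the threshold 10.0 is arbitrary
```

On this toy data the far point (index 25) receives by far the largest score, while the grid points stay close to 1. The threshold, the value of k, and all three definitions would need to be replaced by those given in the full paper to reproduce MHSD itself.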

Keywords: outlier detection; minimum hyper sphere; effective neighbors; local density difference; density deviation degree

Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]

 
