基于局部密度的快速离群点检测算法  被引量:26

Fast outlier detection algorithm based on local density

在线阅读下载全文

作  者:邹云峰 张昕 宋世渊 倪巍伟[2] 

机构地区:[1]国网江苏省电力公司电力科学研究院,南京210036 [2]东南大学计算机科学与工程学院,南京211189

出  处:《计算机应用》2017年第10期2932-2937,共6页journal of Computer Applications

基  金:国家自然科学基金资助项目(61370077)~~

摘  要:已有的密度离群点检测算法LOF不能适应数据分布异常情况离群点检测,INFLO算法虽引入反向k近邻点集有效地解决了数据分布异常情况的离群点检测问题,但存在需要对所有数据点不加区分地分析其k近邻和反向k近邻点集导致的效率降低问题。针对该问题,提出局部密度离群点检测算法——LDBO,引入强k近邻点和弱k近邻点概念,通过分析邻近数据点的离群相关性,对数据点区别对待;并提出数据点离群性预判断策略,尽可能避免不必要的反向k近邻分析,有效提高数据分布异常情况离群点检测算法的效率。理论分析和实验结果表明,LDBO算法效率优于INFLO,算法是有效可行的。Mining outliers is to find exceptional objects that deviate from the most rest of the data set. Outlier detection based on density has attracted lots of attention, but the density-based algorithm named Local Outlier Factor (LOF) is not suitable for the data set with abnormal distribution, and the algorithm named INFLuenced Outlierness (INFLO) solves this problem by analyzing both k nearest neighbors and reverse k nearest neighbors of each data point at cost of inferior efficiency. To solve this problem, a local density-based algorithm named Local Density Based Outlier detection (LDBO) was proposed, which can improve outlier detection efficiency and effectiveness simultaneously. LDBO introduced definitions of strong k nearest neighbors and weak k nearest neighbors to realize outlier relation analysis of those data points located nearby. Furthermore, to improve the outlier detection efficiency, prejudgement was applied to avoid unnecessary reverse k nearest neighbor analysis as far as possible. Theoretical analysis and experimental results Indicate that LDBO outperforms INFLO in efficiency, and it is effective and feasible.

关 键 词:离群点检测 局部密度 强k近邻点 弱k近邻点 反向k近邻点集 

分 类 号:TP274[自动化与计算机技术—检测技术与自动化装置]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象