检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]国网江苏省电力公司电力科学研究院,南京210036 [2]东南大学计算机科学与工程学院,南京211189
出 处:《计算机应用》2017年第10期2932-2937,共6页journal of Computer Applications
基 金:国家自然科学基金资助项目(61370077)~~
摘 要:已有的密度离群点检测算法LOF不能适应数据分布异常情况离群点检测,INFLO算法虽引入反向k近邻点集有效地解决了数据分布异常情况的离群点检测问题,但存在需要对所有数据点不加区分地分析其k近邻和反向k近邻点集导致的效率降低问题。针对该问题,提出局部密度离群点检测算法——LDBO,引入强k近邻点和弱k近邻点概念,通过分析邻近数据点的离群相关性,对数据点区别对待;并提出数据点离群性预判断策略,尽可能避免不必要的反向k近邻分析,有效提高数据分布异常情况离群点检测算法的效率。理论分析和实验结果表明,LDBO算法效率优于INFLO,算法是有效可行的。Mining outliers is to find exceptional objects that deviate from the most rest of the data set. Outlier detection based on density has attracted lots of attention, but the density-based algorithm named Local Outlier Factor (LOF) is not suitable for the data set with abnormal distribution, and the algorithm named INFLuenced Outlierness (INFLO) solves this problem by analyzing both k nearest neighbors and reverse k nearest neighbors of each data point at cost of inferior efficiency. To solve this problem, a local density-based algorithm named Local Density Based Outlier detection (LDBO) was proposed, which can improve outlier detection efficiency and effectiveness simultaneously. LDBO introduced definitions of strong k nearest neighbors and weak k nearest neighbors to realize outlier relation analysis of those data points located nearby. Furthermore, to improve the outlier detection efficiency, prejudgement was applied to avoid unnecessary reverse k nearest neighbor analysis as far as possible. Theoretical analysis and experimental results Indicate that LDBO outperforms INFLO in efficiency, and it is effective and feasible.
关 键 词:离群点检测 局部密度 强k近邻点 弱k近邻点 反向k近邻点集
分 类 号:TP274[自动化与计算机技术—检测技术与自动化装置]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.28