Outlier Detection Algorithm for Hubness Phenomenon  Cited by: 3

Authors: MA Wen-qiang; ZHAO Xu-jun[1]; ZHANG Ji-fu[1]; RAO Yuan-qi (School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)

Affiliation: [1] School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2020, No. 5, pp. 919-924 (6 pages)

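Funding: Supported by the National Natural Science Foundation of China (61572343, U1731126, U1931209); the Applied Basic Research Program of Shanxi Province (201901D111257, 201901D211303); the Key Research and Development Program of Shanxi Province (201803D121059); and the Scientific Research Startup Fund of Taiyuan University of Science and Technology (20192013).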

Abstract: In high-dimensional data, reverse k-nearest neighbor queries give rise to the hubness phenomenon, which seriously degrades the performance of outlier detection algorithms based on reverse k-nearest neighbors. To address this problem, a bidirectional nearest-neighbor outlier detection algorithm for the hubness phenomenon is proposed. First, the algorithm introduces and redefines the influence space of an object; within this space, the effects of both the object's k-nearest neighbors and its reverse nearest neighbors are taken into account, which effectively improves accuracy. Second, heuristic information is introduced that considers not only the outlier degree of the object itself but also that of its k-nearest neighbors; this significantly reduces the required value of k and thereby the computation and running time of the algorithm. Finally, experiments on real datasets verify that the proposed algorithm is more efficient and more accurate than traditional hubness-based outlier mining algorithms.
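The neighborhood notions named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the redefined influence space or the heuristic, so the sketch assumes, purely for illustration, that an object's influence space is the union of its k-nearest neighbors and its reverse k-nearest neighbors. The function names `knn_lists`, `reverse_knn_lists`, and `influence_space` are hypothetical, and the brute-force search is chosen for clarity, not efficiency.

```python
import math


def knn_lists(points, k):
    """For each point index, return the indices of its k nearest neighbors.

    Brute force; ties in distance are broken by the smaller index.
    """
    n = len(points)
    result = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        result.append(others[:k])
    return result


def reverse_knn_lists(knn):
    """For each point index, return the points that list it among their k nearest neighbors."""
    rknn = [[] for _ in knn]
    for i, neighbors in enumerate(knn):
        for j in neighbors:
            rknn[j].append(i)
    return rknn


def influence_space(knn, rknn):
    """ASSUMPTION (not from the paper): influence space = kNN union reverse kNN."""
    return [sorted(set(a) | set(b)) for a, b in zip(knn, rknn)]


# Four clustered points and one isolated point (index 4).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
knn = knn_lists(points, k=2)
rknn = reverse_knn_lists(knn)
space = influence_space(knn, rknn)

# Reverse-neighbor counts: a count of zero marks a point that no other
# point selects as a neighbor, the usual signal in reverse-kNN outlier detection.
print([len(r) for r in rknn])
```

In this toy data the isolated point has no reverse neighbors although it still has k nearest neighbors of its own, which is why a score built only on reverse-neighbor counts and one built on both directions can behave differently.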

Keywords: outlier detection; influence space; hubness phenomenon; reverse k-nearest neighbor

Classification No.: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
