A CONTEXTUAL OUTLIER DETECTION ALGORITHM BASED ON WEIGHTED PROBABILITY DENSITY


Authors: Bai Hui (白慧); Zhang Jifu (张继福) [1]

Affiliation: [1] School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China

Source: Computer Applications and Software (《计算机应用与软件》), 2024, No. 2, pp. 279-285 (7 pages)

Funding: National Natural Science Foundation of China (61876122).

Abstract: This paper proposes a contextual outlier detection algorithm based on weighted probability density. A Gaussian mixture model and a sparsity matrix are used to determine the relevant subspace. Within the relevant subspace, a weighted probability density local anomaly factor formula is used to calculate the outlier factor of each data object, which reflects the degree of inconsistency between a data object and its surrounding data objects. The N data objects with the largest outlier factors are selected as outliers, and each outlier's factor value, relevant-subspace attribute values, and local data set are taken as its contextual information, which improves the interpretability of the detected outliers. Experiments on artificial and UCI data sets validate the effectiveness of the algorithm.
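The abstract does not give the paper's formulas, so the following is only a rough sketch of the general idea (score each object by how much sparser its weighted density is than that of its neighbours, then report the top N with context). Everything in it is an assumption made for illustration: the function names, the Gaussian kernel, the bandwidth `h`, the neighbourhood size `k`, and the attribute `weights`, which are supplied by hand here rather than derived from a Gaussian mixture model and sparsity matrix as in the paper. Attributes with weight 0 are treated as lying outside the relevant subspace.

```python
import math

def wdist2(a, b, weights):
    # Attribute-weighted squared Euclidean distance; weight 0 drops an attribute.
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))

def neighbours(i, data, weights, k):
    # Indices of the k objects nearest to data[i] under the weighted distance.
    others = [j for j in range(len(data)) if j != i]
    others.sort(key=lambda j: wdist2(data[i], data[j], weights))
    return others[:k]

def density(i, data, weights, nbrs, h):
    # Mean Gaussian kernel value between data[i] and its neighbours
    # (an assumed stand-in for the paper's weighted probability density).
    return sum(math.exp(-wdist2(data[i], data[j], weights) / (2 * h * h))
               for j in nbrs) / len(nbrs)

def top_n_outliers(data, weights, n, k=3, h=1.0):
    nbrs = [neighbours(i, data, weights, k) for i in range(len(data))]
    dens = [density(i, data, weights, nbrs[i], h) for i in range(len(data))]
    results = []
    for i in range(len(data)):
        # LOF-style ratio: neighbours' mean density over the object's own density.
        mean_nbr = sum(dens[j] for j in nbrs[i]) / len(nbrs[i])
        results.append({
            "index": i,
            "factor": mean_nbr / (dens[i] + 1e-12),  # small constant avoids division by zero
            "subspace_values": [x for x, w in zip(data[i], weights) if w > 0],
            "local_data": nbrs[i],
        })
    results.sort(key=lambda r: r["factor"], reverse=True)
    return results[:n]

# Hypothetical data: five clustered objects and one distant object.
# The third attribute is noise and is given weight 0.
data = [(1.0, 1.0, 5.0), (1.1, 0.9, 9.0), (0.9, 1.2, 2.0),
        (1.2, 1.1, 7.0), (1.0, 0.8, 1.0), (4.0, 4.0, 4.0)]
weights = (1.0, 1.0, 0.0)
top = top_n_outliers(data, weights, n=1)
print(top[0]["index"], top[0]["subspace_values"], top[0]["local_data"])
```

Each returned record bundles the outlier factor, the attribute values in the assumed relevant subspace, and the indices of the neighbouring objects, mirroring the three kinds of contextual information named in the abstract.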

Keywords: outlier detection; relevant subspace; weighted probability density; contextual information

Classification number: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
