Improved PSO-based algorithm for outlier detection  Cited by: 1


Authors: WANG Meijing (王美晶)[1], YE Dongyi (叶东毅)[1]

Affiliation: [1] College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108

Source: Journal of Computer Applications (《计算机应用》), 2012, Issue A01, pp. 139-143 (5 pages)

Funding: Natural Science Foundation of Fujian Province (2010J01329); Fujian Province University Industry-Academia-Research Major Project (2010H6012)

Abstract: The Particle Swarm Optimization (PSO) based outlier detection method recently proposed by Mohemmed et al. (MOHEMMED A, ZHANG M, BROWNE W. Particle swarm optimisation for outlier detection [C]// GECCO'10: Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation. Portland, Oregon: ACM, 2010: 83-84) can produce a mismatch between a particle's fitness value and the outlying degree of the corresponding data object. This paper analyzes the cause of that mismatch and proposes an improved fitness function. The new fitness function weakens the penalty applied to unreasonable neighborhood-radius estimates, which narrows the deviation between a particle's fitness and the object's outlying degree. The algorithm searches the solution space for an approximately optimal particle to determine a suitable neighborhood-radius estimate, and then uses that estimate to measure the outlying degree of each data object. Experiments on several UCI datasets show that the outlier detection algorithm with the new fitness function outperforms both the original method and the LOF method. The proposed algorithm resolves the mismatch and also detects outliers more effectively, which indicates that a well-defined fitness function helps improve detection quality.
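The abstract does not state the fitness function or the exact definition of outlying degree, so the following is only a minimal sketch of the final step it describes: scoring each object once a neighborhood-radius estimate is available. The function name `outlying_degrees`, the scoring rule (one minus the fraction of other objects within the radius), and the toy data are all assumptions made for illustration, not the authors' definitions. The PSO search that would supply `radius` is omitted.

```python
import math


def outlying_degrees(points, radius):
    """Score each point by how few other points lie within `radius`.

    Assumed rule (illustrative, not from the paper):
    score = 1 - neighbours / (n - 1), so an isolated point scores 1.0
    and a point within `radius` of every other point scores 0.0.
    Requires at least two points.
    """
    n = len(points)
    scores = []
    for i, p in enumerate(points):
        # Count the other points that fall inside the radius-neighborhood of p.
        neighbours = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        scores.append(1.0 - neighbours / (n - 1))
    return scores


# Toy data: a tight cluster of four points plus one distant point.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = outlying_degrees(data, radius=0.5)

# Index of the object with the highest assumed outlying degree.
most_outlying = max(range(len(data)), key=scores.__getitem__)
```

With this toy data the distant point has no neighbours within the radius and receives the highest score. In the method the abstract describes, the radius would come from the PSO search rather than being fixed by hand.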

Keywords: data mining; outlier detection; Particle Swarm Optimization (PSO); outlying degree; fitness function

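Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]; TP18 [Automation and Computer Technology: Computer Science and Technology]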

 
