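Affiliations: [1] School of Mathematics and Information Science, Guangxi University, Nanning 530004 [2] School of Computer Science, Sichuan University, Chengdu 610065
Source: Application Research of Computers (《计算机应用研究》), 2011, No. 5, pp. 1731-1733 (3 pages)
Funding: Guangxi University Research Fund project (XJZ100258)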
Abstract: Existing outlier detection algorithms remain inadequate in generality, effectiveness, user-friendliness, and performance on large, high-dimensional datasets. To address this, the paper proposes a fast and effective global outlier detection method based on hierarchical clustering. Agglomerative hierarchical clustering is performed first; the clustering tree and the distance matrix are then used to judge visually how isolated the data are and to determine the number of outliers. The outliers are then removed without supervision, working from the top of the clustering tree downward. Simulation experiments show that the method identifies global outliers quickly and effectively, is user-friendly, works on datasets of various shapes, and is suitable for outlier detection on large, high-dimensional datasets.
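The abstract gives only the outline of the procedure, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes single-linkage agglomerative clustering, assumes that the smaller branch at each split near the root is the isolated one, and takes the number of outliers as a parameter (`n_outliers`) in place of the visual judgment the abstract describes. All function names are hypothetical.

```python
import math


def pairwise_distances(points):
    """Full Euclidean distance matrix as a list of lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def single_linkage_tree(points):
    """Build a binary merge tree bottom-up (naive implementation, small inputs only).

    Each node is a tuple (members, left, right, height); leaves have
    left = right = None. Single linkage is an assumption of this sketch.
    """
    dist = pairwise_distances(points)
    clusters = [((i,), None, None, 0.0) for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist[i][j] for i in clusters[a][0] for j in clusters[b][0])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = (clusters[a][0] + clusters[b][0], clusters[a], clusters[b], d)
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0]


def top_down_outliers(root, n_outliers):
    """Walk from the root downward, peeling off the smaller branch at each split.

    Stops once n_outliers points are collected, or when the next branch
    would exceed that budget. Returns sorted point indices.
    """
    outliers = []
    node = root
    while node[1] is not None and len(outliers) < n_outliers:
        small, large = sorted((node[1], node[2]), key=lambda c: len(c[0]))
        if len(outliers) + len(small[0]) > n_outliers:
            break
        outliers.extend(small[0])
        node = large
    return sorted(outliers)


# Five points in a tight group plus two isolated points (indices 5 and 6).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (-8, 7)]
tree = single_linkage_tree(points)
print(top_down_outliers(tree, n_outliers=2))
```

On this toy data the two isolated points are the last to merge, so they sit directly under the root and are the first branches removed. The naive merge loop is far too slow for large data; it is used here only to keep the sketch dependency-free.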
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]