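Authors: 刘大任[1] 孙焕良[2] 牛志成[3] 朱叶丽[4]
Affiliations: [1] Journal Editorial Department, Shenyang Jianzhu University, Shenyang, Liaoning 110168; [2] School of Science, Shenyang Jianzhu University, Shenyang, Liaoning 110168; [3] Computing Center, Shenyang Jianzhu University, Shenyang, Liaoning 110168; [4] School of Information and Control Engineering, Shenyang Jianzhu University, Shenyang, Liaoning 110168
Source: Journal of Shenyang Jianzhu University: Natural Science (《沈阳建筑大学学报(自然科学版)》), 2006, No. 1, pp. 149-153 (5 pages)
Funding: Natural Science Foundation of Liaoning Province (20052006); Liaoning Provincial Department of Education Key Research Project (052354)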
Abstract: Objective: To propose a new clustering algorithm that performs clustering and detects outliers (isolated points) at the same time. Methods: The SNN and LOF algorithms are combined into a new algorithm, SNN-LOF, which proceeds as follows: (1) build the similarity matrix; (2) remove noise; (3) compute the density; (4) mark the core points; (5) compute the lrd value of each data point; (6) form clusters starting from the core objects; (7) extract the data points treated as noise; (8) compute the LOF value of the data defined as noise and output the data points regarded as outliers. A program implementing the algorithm was written to carry out clustering and outlier detection. Results: On the CURE dataset, the DBSCAN and SNN clustering algorithms produce the same clustering results with very similar running times. Once the data size exceeds 10,000, however, SNN-LOF clusters markedly more efficiently than DBSCAN and also detects the outliers. Conclusion: SNN-LOF can find outliers while clustering, and on large data volumes its clustering time is markedly lower than that of DBSCAN.
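The abstract names the quantities SNN-LOF relies on (a shared-nearest-neighbour similarity, the lrd value, and the LOF value) without defining them. The sketch below illustrates the standard textbook definitions of those three quantities on a toy dataset. It is not the authors' program: the function names, the choice of `k`, and the sample points are all assumptions made for illustration, the SNN similarity is simplified to a plain overlap count of k-nearest-neighbour lists, ties in the k-distance are ignored, and duplicate points (which would make lrd undefined) are assumed absent.

```python
import math

def knn(points, k):
    """For each point, return its k nearest neighbours as (distance, index) pairs."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result

def snn_similarity(neigh, i, j):
    """Simplified SNN similarity: number of neighbours shared by points i and j."""
    a = {idx for _, idx in neigh[i]}
    b = {idx for _, idx in neigh[j]}
    return len(a & b)

def lrd(neigh, i):
    """Local reachability density: inverse of the mean reachability distance."""
    total = 0.0
    for d, j in neigh[i]:
        k_dist_j = neigh[j][-1][0]      # distance from j to its k-th neighbour
        total += max(k_dist_j, d)       # reachability distance of i from j
    return len(neigh[i]) / total

def lof(neigh, i):
    """Local outlier factor: mean lrd of the neighbours divided by lrd of i."""
    lrd_i = lrd(neigh, i)
    return sum(lrd(neigh, j) for _, j in neigh[i]) / (len(neigh[i]) * lrd_i)

# Hypothetical toy data: four corners of a unit square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
neigh = knn(points, k=2)
scores = [lof(neigh, i) for i in range(len(points))]
shared = snn_similarity(neigh, 0, 3)    # opposite corners share both neighbours
```

In this toy example the four square corners receive a LOF of 1.0 (their density matches that of their neighbours), while the distant point receives a LOF well above 1, which is the kind of signal step (8) uses to report outliers among the points set aside as noise.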
Classification: TP274 [Automation and Computer Technology: Detection Technology and Automatic Devices]