Fuzzy Clustering Algorithm for K-Nearest Neighbors Spatial Density Distribution

Cited by: 2

Authors: ZHANG Li [1], LU Yan-ping, HOU Qing, ZHANG Hao-bo

Affiliations: [1] College of Information, Liaoning University, Shenyang 110036, Liaoning, China; [2] College of Criminal Investigation Science and Technology, Criminal Investigation Police University of China, Shenyang 110854, Liaoning, China

Source: Journal of Liaoning University: Natural Sciences Edition, 2023, No. 4, pp. 289-301, F0002 (14 pages in total)

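Funding: National Natural Science Foundation of China (62072220); Liaoning Province Central Government Guiding Local Science and Technology Development Fund Program (2022JH6/100100032); Natural Science Foundation of Liaoning Province (2022-KF-13-06).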

Abstract: Clustering is an essential tool in data mining research and applications, but incomplete data poses a challenge to existing clustering algorithms. To address the uncertainty that imputation introduces when clustering incomplete data, this paper proposes a fuzzy clustering algorithm based on the K-nearest neighbors spatial density distribution. First, the K-nearest-neighbor sample set of each sample with missing data is determined from the similarity between samples. Because missing values are uncertain, data distribution information from that K-nearest-neighbor set is then used to fill each missing value with an interval rather than a single number. Second, to account for the influence of outliers on clustering, the spatial density distribution of the data is introduced, yielding a density-distribution interval fuzzy C-means algorithm. Finally, the fuzzy C-means algorithm is used to cluster the interval-filled data. Experimental results on UCI datasets and artificial datasets show that the algorithm effectively improves clustering accuracy and robustness.
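The interval-filling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_interval_impute`, the partial-distance similarity, and the rule of taking the minimum and maximum of the neighbors' values as the interval bounds are all assumptions made for illustration. The abstract states only that the interval is derived from distribution information of the K-nearest-neighbor set. The density-weighted interval fuzzy C-means step is not shown.

```python
import numpy as np

def knn_interval_impute(X, k=3):
    """Fill each missing entry (NaN) with an interval [low, high].

    Assumption (not from the paper): the interval is the min/max of the
    attribute over the k nearest neighbors that have it observed.
    Observed entries get a degenerate interval with low == high.
    """
    X = np.asarray(X, dtype=float)
    low, high = X.copy(), X.copy()
    observed = ~np.isnan(X)
    n, d = X.shape
    for i in range(n):
        missing = np.where(~observed[i])[0]
        if missing.size == 0:
            continue
        # Partial distance: compare only attributes observed in both
        # samples, rescaled by the fraction of attributes actually used.
        dists = np.full(n, np.inf)
        for j in range(n):
            if j == i:
                continue
            shared = observed[i] & observed[j]
            # A candidate neighbor must have the attributes we need to fill.
            if shared.any() and observed[j, missing].all():
                diff = X[i, shared] - X[j, shared]
                dists[j] = np.sqrt(d / shared.sum() * np.sum(diff ** 2))
        neighbors = np.argsort(dists)[:k]
        neighbors = neighbors[np.isfinite(dists[neighbors])]
        if neighbors.size == 0:
            continue  # no usable neighbor: leave the entry as NaN
        for a in missing:
            low[i, a] = X[neighbors, a].min()
            high[i, a] = X[neighbors, a].max()
    return low, high

X = np.array([
    [1.0, 2.0],
    [1.1, 2.2],
    [0.9, 1.8],
    [5.0, 9.0],      # an outlier that should not be among the 3 neighbors
    [1.0, np.nan],   # incomplete sample
])
low, high = knn_interval_impute(X, k=3)
print(low[4], high[4])
```

In this toy data the incomplete sample's three nearest neighbors are the first three rows, so its missing attribute is filled with the interval [1.8, 2.2] and the outlier row does not affect the bounds. An interval clustering method would then operate on the `low` and `high` arrays together.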

Keywords: incomplete data; K-nearest neighbors; fuzzy C-means; density

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
