Rough K-means incremental clustering algorithm considering neighborhood belonging information of boundary samples  (Cited by: 7)


Authors: MA Fu-min (马福民) [1]; SUN Jing-yong (孙静勇); ZHANG Teng-fei (张腾飞) (College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China; College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)

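Affiliations: [1] College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023; [2] College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023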

Source: Control and Decision (《控制与决策》), 2022, No. 11, pp. 2968-2976 (9 pages)

Funding: National Natural Science Foundation of China (61973151, 62073173); Natural Science Foundation of Jiangsu Province (BK20191406, BK20191376).

Abstract: The key to improving the quality of incremental clustering is how to measure the membership of newly added data on the basis of the existing clustering result. Existing incremental clustering algorithms mostly consider the position distribution of the new data and ignore the cluster-belonging information of the neighboring data points. Building on the rough K-means clustering algorithm, and targeting the uncertainty of new data points that fall into the boundary regions of the existing clusters, a rough K-means incremental clustering algorithm based on neighborhood belonging information is proposed. The algorithm jointly considers the position distribution of a new sample in the boundary region and the cluster membership of its neighboring data points, so that the measure of how strongly the new point belongs to each cluster is more reasonable. In addition, during incremental clustering, clusters are merged or split according to the changes in cluster structure caused by the new data points, so that the cluster partition adapts itself. Comparative experiments on artificial data sets and UCI standard data sets verify the effectiveness of the algorithm.
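The abstract does not give the paper's formulas, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a ratio threshold `eps` to decide whether a new point falls in a boundary region (a common rough K-means convention), a k-nearest-neighbor vote over hard labels of existing samples as the "neighborhood belonging information", and a weight `alpha` that mixes the two terms. The function name `assign_new_point` and all parameters are hypothetical. Cluster merging and splitting are not covered.

```python
import numpy as np

def assign_new_point(x, centers, data, labels, eps=1.3, k=3, alpha=0.5):
    """Score a new point against existing clusters (illustrative only).

    Assumptions, none taken from the paper: boundary test by distance
    ratio `eps`, neighborhood term as the share of the k nearest existing
    samples carrying each cluster label, linear mix with weight `alpha`.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)

    # Distance from the new point to every cluster center.
    d = np.maximum(np.linalg.norm(centers - x, axis=1), 1e-12)
    nearest = int(np.argmin(d))

    # Clusters whose center is almost as close as the nearest one.
    candidates = np.flatnonzero(d / d[nearest] <= eps)
    if candidates.size == 1:
        # Unambiguous: the point goes to the lower approximation.
        return "lower", {nearest: 1.0}

    # Position term: normalized inverse distance over the candidates.
    inv = 1.0 / d[candidates]
    position = inv / inv.sum()

    # Neighborhood term: label shares among the k nearest old samples.
    k = min(k, len(data))
    nn = np.argsort(np.linalg.norm(data - x, axis=1))[:k]
    neighbour = np.array([np.mean(labels[nn] == j) for j in candidates])

    score = alpha * position + (1.0 - alpha) * neighbour
    return "boundary", {int(j): float(s) for j, s in zip(candidates, score)}

# Toy data: cluster 0 is spread along the x-axis, cluster 1 is compact.
centers = [[0.0, 0.0], [4.0, 0.0]]
data = [[-1.8, 0.0], [-1.9, -0.1], [-1.7, 0.1],
        [1.8, 0.0], [1.9, 0.1], [1.7, -0.1],
        [4.0, 0.0], [4.2, 0.0], [3.8, 0.0]]
labels = [0] * 6 + [1] * 3

region, scores = assign_new_point([2.1, 0.0], centers, data, labels)
print(region, scores)
```

In this toy case the new point is slightly closer to the center of cluster 1, yet its three nearest existing samples all belong to cluster 0, so the mixed score favors cluster 0. This is the kind of situation in which position alone and position plus neighborhood information disagree.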

Keywords: rough K-means clustering; incremental clustering; neighborhood belonging information; cluster structure

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
