A Fuzzy C-Means Algorithm Based on Huffman Tree  Cited by: 2


Authors: Xiao Mansheng; Zhou Lijuan; Wen Zhicheng (School of Computer Science, Hunan University of Technology, Zhuzhou 412007, China)

Affiliation: [1] School of Computer Science, Hunan University of Technology, Zhuzhou 412007, China

Source: Data Analysis and Knowledge Discovery, 2018, No. 7, pp. 81-88 (8 pages)

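Funding: This work is one of the research outcomes of the 2018 Natural Science Foundation of Hunan Province project "Fuzzy C-Means Clustering under Non-Normalized Constraints and Its Application in Image Processing" (Grant No. 2018JJ4068) and the 2016 Scientific Research Project of the Hunan Provincial Department of Education "Research on Network Situation Awareness Technology Based on Data Fusion" (Grant No. 16C0480).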

Abstract: [Objective] This paper addresses three problems of the traditional FCM algorithm: it selects the initial cluster centers at random, it is sensitive to noise, and it is suited only to clustering evenly distributed samples. [Methods] We propose a new FCM clustering algorithm based on a Huffman tree. The algorithm constructs a dissimilarity matrix over high-density samples, builds a Huffman tree from that matrix to obtain the initial cluster centers, and then derives a sample membership function under a non-normalized constraint. [Results] Comparative experiments on synthetic samples, image datasets, and UCI datasets show that the proposed algorithm outperforms both the Gaussian-kernel-based FCM algorithm and the traditional FCM algorithm on measures such as clustering accuracy and running time. [Limitations] The sample density adjustment factor β is determined only by experiment or experience and still lacks a theoretical basis. [Conclusions] The proposed algorithm has practical value for clustering real-world datasets that contain a large number of noise samples or whose samples are unevenly distributed.

Keywords: Sample density; Dissimilarity; Huffman tree; Membership degree

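CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]; G35 [Automation and Computer Technology: Computer Science and Technology]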
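
For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of the textbook (traditional) FCM iteration, not the authors' Huffman-tree algorithm. The function name `fcm`, its parameters, and the sample data are illustrative assumptions. Two points are worth noting: the result depends on the `centers` passed in, which is why initialization matters, and each sample's memberships are normalized to sum to 1, which is the constraint the paper reports relaxing.

```python
import math

def fcm(points, centers, m=2.0, iters=100, tol=1e-6):
    """Textbook fuzzy C-means with caller-supplied initial centers.

    points  : list of equal-length numeric tuples
    centers : initial cluster centers (hypothetical input; the paper
              obtains these from a Huffman tree, which is not shown here)
    m       : fuzzifier, must be > 1
    """
    dim = len(points[0])
    n = len(points)
    u = []
    for _ in range(iters):
        # Membership update: u[i][j] = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).
        # Each row sums to 1 (the normalization constraint).
        u = []
        for p in points:
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            u.append([1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d)
                      for dj in d])
        # Center update: mean of the samples weighted by u^m.
        new_centers = []
        for j in range(len(centers)):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(tuple(
                sum(w[i] * points[i][k] for i in range(n)) / total
                for k in range(dim)))
        # Stop once no center moves more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # u holds the memberships computed in the last iteration.
    return centers, u

# Usage on two well-separated groups of made-up points.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_centers, memberships = fcm(data, [(0.0, 0.0), (10.0, 10.0)])
```

Because every row of `memberships` must sum to 1, an outlier far from all centers still receives substantial membership in some cluster, which is one common explanation for the noise sensitivity mentioned in the abstract.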

 
