Semi-supervised clustering based on closure criterion and pairwise constraints


Authors: XIANG Li-hong (向力宏), JIN Ying-hua (金应华) [1], XU Sheng-bing (徐圣兵) [1]

Affiliation: [1] School of Applied Mathematics, Guangdong University of Technology, Guangzhou 510520, Guangdong, China

Source: Journal of Foshan University (Natural Science Edition), 2020, No. 2, pp. 34-44 (11 pages)

Abstract: The semi-supervised clustering algorithm based on power divergence and pairwise constraints (PD-sSC) generalizes relative entropy to the power divergence (PD) family, removes the interference between different penalty entropy terms in the objective function, and makes the selection of penalty coefficients more efficient. However, when the number of pairwise constraints is relatively large, the clustering performance of PD-sSC is unsatisfactory. To address this problem, a pairwise constraint packing algorithm based on the closure criterion (CCPC) is proposed. The algorithm uses the must-link constraints to pack the original samples into packages, replaces each package with its center point to obtain a new sample set, and then applies PD-sSC to cluster the new samples. Note that the new sample set carries updated cannot-link constraints and no must-link constraints. Experimental results show that CCPC performs well whether the number of pairwise constraints is large or small.
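The packing step described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so several points here are assumptions: the "closure" is taken to be the transitive closure of the must-link constraints (computed with a union-find structure), the "center point" of a package is taken to be the arithmetic mean of its members, and the function name `pack_by_must_link` is hypothetical. The subsequent PD-sSC clustering of the packed samples is not shown.

```python
import numpy as np


def pack_by_must_link(X, must_link, cannot_link):
    """Pack samples joined by must-link constraints and return package centers.

    Assumptions (not from the paper): packages are the connected components
    of the must-link graph, and each center is the mean of its members.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the two packages of every must-link pair.
    for a, b in must_link:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Relabel package roots as 0..k-1 in order of first appearance.
    label_of = {}
    labels = np.empty(n, dtype=int)
    for i in range(n):
        labels[i] = label_of.setdefault(find(i), len(label_of))
    k = len(label_of)

    # One representative point per package.
    centers = np.vstack([X[labels == p].mean(axis=0) for p in range(k)])

    # Lift cannot-link constraints from samples to packages.
    packed_cl = set()
    for a, b in cannot_link:
        pa, pb = int(labels[a]), int(labels[b])
        if pa == pb:
            raise ValueError("cannot-link pair falls inside one must-link package")
        packed_cl.add((min(pa, pb), max(pa, pb)))

    return centers, labels, sorted(packed_cl)


# Toy data: samples 0-2 form one package, samples 3-4 another.
X = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0], [10.0, 10.0], [12.0, 10.0]]
centers, labels, packed_cl = pack_by_must_link(
    X, must_link=[(0, 1), (1, 2), (3, 4)], cannot_link=[(0, 3), (2, 4)]
)
```

In this toy example the five samples collapse into two package centers, and the two sample-level cannot-link pairs collapse into a single package-level pair. The packed set has no must-link constraints left, which matches the abstract's note about the new sample set.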

Keywords: closure criterion; maximum entropy clustering; pairwise constraints; power divergence

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
