MSHC:一种多阶段超图聚类算法  

MSHC:a multi-stage hypergraph clustering algorithm

作  者:张春英 王静 刘璐 兰思武 张庆达 ZHANG Chunying;WANG Jing;LIU Lu;LAN Siwu;ZHANG Qingda(College of Science,North China University of Science and Technology,Tangshan Hebei 063210,Hebei Province,P.R.China;Hebei Key Laboratory of Data Science and Application,North China University of Science and Technology,Tangshan 063210,Hebei Province,P.R.China;The Key Laboratory of Engineering Computing in Tangshan City,North China University of Science and Technology,Tangshan 063210,Hebei Province,P.R.China;Tangshan Intelligent Industry and Image Processing Technology Innovation Center,North China University of Science and Technology,Tangshan 063210,Hebei Province,P.R.China;Hebei Engineering Research Center for the Intelligentization of Iron Ore Optimization and Ironmaking Raw Materials Preparation Processes,North China University of Science and Technology,Tangshan Hebei 063210,P.R.China)

机构地区:[1]华北理工大学理学院,河北唐山063210 [2]华北理工大学河北省数据科学与应用重点实验室,河北唐山063210 [3]华北理工大学唐山市工程计算重点实验室,河北唐山063210 [4]华北理工大学唐山市智能工业与图像处理技术创新中心,河北唐山063210 [5]华北理工大学铁矿石优选与铁前工艺智能化河北省工程研究中心,河北唐山063210

出  处:《深圳大学学报(理工版)》2025年第1期68-76,共9页Journal of Shenzhen University(Science and Engineering)

基  金:河北省属高校基本科研业务费资助项目(JST2022001)。

摘  要:超图作为普通图的高维推广,能够更加灵活地反映节点间的高阶复杂关系.超图聚类旨在发现超图结构中复杂的高阶关联关系.针对目前超图聚类结果不稳定、容易陷入局部最优等问题,结合超图划分思想,提出一种多阶段超图聚类(multi-stage hypergraph clustering,MSHC)算法,该算法将超图聚类过程分为超图约简、超图初始聚类以及优化迁移3个阶段.在超图约简阶段,提出一种不改变超图结构的快速约简方法,降低了后续算法的复杂度;提出基于集对分析理论的超图节点间相似性度量方法,并采用层次聚类方法对超图进行初始聚类,采用4种不同的类簇合并计算方法,增加聚类方案的多样性;将遗传算法应用于优化超图聚类方案的研究中,以此获得最优超图聚类方案.在3个不同规模的数据集上与4个经典的超图聚类方法进行对比实验,结果表明,MSHC算法在Songs_genres数据集和Papers_keywords数据集上超图模块度指数分别提高了0.0797和0.0777,在Movies_genres数据集上仅降低0.0060.As a high-dimensional extension of ordinary graphs,hypergraphs can more flexibly reflect high-order complex relationships between nodes.Hypergraph clustering aims to discover complex high-order correlations in powerful hypergraph structures.In response to challenges faced by current hypergraph clustering algorithms,such as extremely high complexity,unstable clustering results,and the tendency to fall into local optima,a multi-stage hypergraph clustering algorithm denoted as MSHC is proposed based on the idea of hypergraph partitioning.This algorithm divides the hypergraph clustering process into three stages:hypergraph reduction,hypergraph initial clustering,and optimization migration.In the first stage,a fast reduction method that preserves the hypergraph structure is proposed to reduce the complexity of subsequent algorithms.In the second stage,a similarity measurement method between hypergraph nodes based on set pair analysis theory is introduced,and hierarchical clustering algorithm is applied for initial clustering.Four different cluster merging strategies are employed to increase the diversity of clustering schemes.In the final stage,the genetic algorithm is applied to obtain the optimal hypergraph clustering scheme.Comparative experiments are conducted with two traditional hypergraph clustering algorithms on three data sets with different sizes.Experimental results show that the hypergraph modularity index of the MSHC algorithm is improved by 0.079 and 0.077on the Songs_genres and Papers_keywords datasets respectively,and is only reduced by 0.006 on the Movies_genres dataset.

关 键 词:数据处理 超图聚类 遗传算法 集对分析理论 超图约简 多阶段聚类 超图模块度 

分 类 号:TP274[自动化与计算机技术—检测技术与自动化装置]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象