机构地区:[1]南京财经大学江苏省电子商务重点实验室,南京210023 [2]南京理工大学计算机科学与工程学院,南京210094 [3]云境商务智能研究院南京有限公司,南京210003
出 处:《计算机学报》2021年第9期1824-1840,共17页Chinese Journal of Computers
基 金:国家重点研发计划(2019YFB1405000);国家自然科学基金(71871109);国家自然科学基金重点支持项目(92046206)资助.
摘 要:以微博、微信为代表的社交网络不仅包含丰富的节点属性信息,还蕴含复杂的网络拓扑信息,这些社交网络通常可被建模为属性图.传统的图聚类方法假设节点属性与网络拓扑共享同一类簇结构.然而,在真实社交网络中,节点属性与网络拓扑所对应的类簇结构并非完全一致.譬如,通过社团发现技术分析新浪微博的好友关注列表能够直观地获取聚集在同一群组的用户集合;而借助文本挖掘技术分析同一群组的用户生成内容却会发现用户讨论话题的分布广泛,体现出差异化的用户偏好特征.如何有效融合属性与拓扑信息对属性图进行聚类是理解、分析和可视化大规模社交网络的关键难题之一.为此,本文将属性图聚类建模为多目标优化问题,提出一种基于动态类簇形成博弈的属性图聚类方法.首先定义一种新颖的中心性指标度量节点的影响力,并提出一种启发式方法初始化属性图类簇质心;其次在动态博弈理论框架下,提出一种贪心的局部搜索策略更新节点类簇标签,并严格证明该局部搜索策略可使类簇结构收敛至局部帕累托最优解;最后设计一种基于多智能体自治计算的属性图聚类算法,该算法无需预设初始类簇个数,且复杂度近似线性于边的数目.为验证本文所提算法的性能,我们依次从三个方面来对其进行测试和评估.首先我们在Google+属性图上对所提算法进行了单独的收敛性分析.我们测试了算法中四个需要优化的目标函数(K-means损失函数、Havrda-Charvat生成熵、负模块度和负紧凑度)在三个不同的Bregman散度(欧氏距离平方、KL散度距离和余弦距离)设置下的收敛性情况.实验结果表明,四个目标函数能在50轮迭代之后达到收敛状态.然后,我们在4个大规模属性图上分别从聚类精度和可扩展性两个方面将本文所提算法与9个基准方法作了充分对比.对比结果表明,本文所提Except for rich node attribute information,there is the complex topological information in some modern online social networks,such as Sina Weibo and WeChat.Such types of social network can usually be represented as an attributed graph.Traditional graph clustering approaches are often based on an assumption that the node attributes and network topology share a same cluster membership.However,it does not always hold in many real-world social networks.Take Sina Weibo as an example,analyzing the follow lists of Weibo users through community detection techniques can directly obtain which users gather into a social group,while these users may produce diverse user-generated content,reflecting differentiated preference characteristics.How to effectively integrate attributive and topological information for clustering attributed graphs becomes a new challenge,which is also critical for understanding,analyzing as well as visualizing large-scale social networks.In this paper,we formulated the target problem as a multi-objective optimization problem,and proposed a dynamic cluster formation game based attributed graph clustering approach.First,we defined a new centrality index,called the influence of nodes,to measure the node influence and designed an effective heuristic method to initialize the cluster centroids of attribute graphs.Second,based on the dynamic game theory,a greedy local search strategy was proposed to update the cluster labels of nodes,and we strictly proved that such local search strategy can make the cluster structure converge to the local Pareto optimality.Third,an autonomy-oriented computing based attributed graph clustering algorithm was proposed,which does not need to specify the cluster number and its running time scales linearly with the total number of edges.Furthermore,we tested and evaluated the proposed approach’s performance from three aspects.First,we performed a separate convergence analysis for the proposed approach on the Google+attributed social network.We tested the convergence of four ob
关 键 词:属性图聚类 多目标优化 动态类簇形成博弈 局部帕累托最优 自治计算
分 类 号:TP391[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...