基于滴水原理的关联聚类算法  

Correlation Clustering Based on Drop of Water Principle

在线阅读下载全文

作  者:华佳林[1,2] 于剑 HUA Jialin1,2,YU Jian1,2(1.School of Computer and Information Technology, Beijing Jiaotong University,Beijing 100044, China; 2.Beijing Key Laboratory of Traffic Data Analysis and Mining,Beijing 100044,Chin)

机构地区:[1]北京交通大学计算机与信息技术学院,北京100044 [2]交通数据分析与挖掘北京市重点实验室,北京100044

出  处:《计算机科学与探索》2018年第6期961-971,共11页Journal of Frontiers of Computer Science and Technology

基  金:国家自然科学基金Nos.61370129;61375062;高等学校博士学科点专项科研基金No.20120009110006;长江学者和创新团队发展计划No.IRT201206;中央高校基本科研业务费专项资金No.2014JBM029~~

摘  要:随着各种新兴媒体的发展,在数据挖掘领域出现了越来越多的新问题和新任务,关联聚类问题就是其中之一,最近受到越来越多的关注。现实中有很多问题可以使用关联聚类技术来处理,比如图像分割和垃圾邮件过滤等。大规模有符号图的出现越来越频繁,虽然之前有很多关联聚类算法被提出,但是很少算法能够处理规模很大的有符号图。提出了一个基于滴水原理的算法来处理大规模有符号图的聚类问题。算法过程包括:根据滴水原理来收缩图的规模,将一个水流上的所有点看成是一个新的点,这样可以极大地减小图的规模;在新的图中选出重要的点,并根据整数线性规划来判断邻居点是否合并。实验结果表明,该算法能够快速有效地进行大规模有符号图的聚类。With the development of various new media, there are many new issues popped up in data mining. Correlation clustering is one of these issues, that receives more and more attentions recently. It has been widely used in real applications such as image segmentation and spam filtering. Many methods have been proposed to process the issues.However, few of them can process large signed graphs which become common in practice. This paper proposes a novel algorithm based on the drop of water principle to tackle the problem of clustering large scale signed graphs.The algorithm is constructed according to the following procedures. Firstly, the signed graphs are shrank based on the drop of water principle. The points within a stream can be taken as a new point in the graph, the scale of signed graphs can be enormously decreased in this procedure. Secondly, some important points are chosen as seeds, and based on integer linear programming, the seeds can decide whether the neighbors joint the seeds or not. The experimental results show that the proposed method can cluster large scale graphs effectively and efficiently.

关 键 词:滴水原理 有符号图 关联聚类 整数线性规划 

分 类 号:TP30[自动化与计算机技术—计算机系统结构]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象