Hotspot Detection Based on the Combination of Clustering and Support Vector Machine (基于聚类和支持向量机相结合的热点发现)  Cited by: 1

Authors: Gan Mengzhuang (甘孟壮)[1], Fan Xinghua (樊兴华)[1]

Affiliation: [1] School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065

Source: Modern Computer (《现代计算机》), 2013, No. 6, pp. 9-14 (6 pages)

Abstract: To mine microblog hotspots more promptly and effectively, this paper proposes a hotspot detection method that combines unsupervised clustering with a support vector machine (SVM). The method exploits the correlation among hot events and uses that correlation to predict how likely an unknown event is to be a hot event. The algorithm first trains an SVM on labeled positive and negative examples to obtain an SVM classifier. It then clusters the test set with the K-means algorithm to obtain candidate hotspot clusters. Finally, it classifies the samples in each cluster with the SVM classifier and computes, for each cluster, the proportion of samples classified as hot out of all samples in that cluster. A heat formula is applied to obtain the heat of each cluster, and the clusters are sorted by heat to produce the final result. Three ways of computing heat are designed and tested in the same environment. The experiments show that the method combining clustering with an SVM provides good guidance for hotspot detection.
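The last step of the abstract, the per-cluster proportion of hot samples, can be illustrated with a minimal Python sketch. It assumes the cluster assignments (for example from K-means) and the hot/not-hot labels (for example from the trained SVM) have already been computed elsewhere. The function name `cluster_hot_rates` and the toy data are hypothetical, and the paper's three heat formulas are not given in the abstract, so only the plain proportion is shown.

```python
from collections import defaultdict


def cluster_hot_rates(cluster_ids, is_hot):
    """Rank clusters by the fraction of their samples labeled hot.

    cluster_ids: cluster label for each sample (assumed to come from K-means).
    is_hot: truthy if the classifier labeled the sample hot (assumed SVM output).
    Returns a list of (cluster_id, hot_rate) sorted by descending hot rate.
    """
    if len(cluster_ids) != len(is_hot):
        raise ValueError("cluster_ids and is_hot must have the same length")

    totals = defaultdict(int)  # number of samples per cluster
    hots = defaultdict(int)    # number of hot samples per cluster
    for cid, hot in zip(cluster_ids, is_hot):
        totals[cid] += 1
        if hot:
            hots[cid] += 1

    rates = {cid: hots[cid] / totals[cid] for cid in totals}
    # Highest hot rate first; ties broken by cluster id for a stable order.
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy input: nine samples spread over three clusters.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2, 2]
is_hot = [1, 1, 0, 0, 0, 1, 1, 1, 1]
ranking = cluster_hot_rates(cluster_ids, is_hot)
```

In this toy input, cluster 2 ranks first because every one of its samples is labeled hot, and cluster 1 ranks last because none are. A heat score other than the plain proportion could be substituted in the `rates` computation without changing the rest of the sketch.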

Keywords: microblog; hotspot detection; unsupervised clustering; support vector machine; hotspot rate

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
