Source: Journal of Taiyuan Normal University (Natural Science Edition), 2017, No. 1, pp. 46-52 (7 pages).
Funding: National Natural Science Foundation of China project "Theory and Methods of Sparse Representation of User Behavior Data" (61273294); Shanxi Province Key Science and Technology Research Program project "Research on High-Speed Network Intrusion Detection Technology" (20110321024-02).
Abstract: Spectral clustering suffers from bottlenecks in both storage space and computation time when the data set is large. This paper analyzes two common remedies: spectral clustering based on a sparse t-nearest-neighbor graph and spectral clustering based on Nyström low-rank matrix approximation. To further improve the accuracy of these two methods, a scheme is proposed in which the similarity between samples is computed with a Euclidean distance weighted by information-entropy-based attribute weights. First, the weight of each attribute of the samples is computed; then the sample similarity matrix is built and applied to both the sparse t-nearest-neighbor and the Nyström low-rank approximation spectral clustering methods; finally, the approach is validated on multiple data sets. Experimental results show that on several data sets the proposed method achieves higher clustering accuracy than the original spectral clustering algorithms; on the Pendigits data set in particular, the entropy-weighted sparse t-nearest-neighbor spectral clustering improves accuracy by 15.11% over the plain sparse t-nearest-neighbor method.
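The following minimal Python sketch illustrates the pipeline the abstract describes: entropy-based attribute weighting, a weighted Euclidean (Gaussian) similarity matrix, sparsification to a t-nearest-neighbor graph, and spectral clustering on the result. The specific entropy-weight formula, the Gaussian kernel, and the parameters sigma and t are illustrative assumptions rather than the paper's exact formulation, and the Nyström low-rank variant is not shown.

# A minimal sketch (not the paper's reference implementation) of
# entropy-weighted similarity feeding sparse t-nearest-neighbor
# spectral clustering.  The entropy-weight formula, the Gaussian
# similarity, sigma and t are assumptions made for illustration.
import numpy as np
from sklearn.cluster import SpectralClustering

def entropy_attribute_weights(X, eps=1e-12):
    """Entropy-weight method: attributes with lower entropy (i.e. more
    discriminative) receive larger weights."""
    X = np.asarray(X, dtype=float)
    # Min-max normalize each attribute to [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    Xn = (X - X.min(axis=0)) / (span + eps)
    # Treat each column as a distribution over samples.
    P = Xn / (Xn.sum(axis=0) + eps)
    entropy = -np.sum(P * np.log(P + eps), axis=0) / np.log(X.shape[0])
    w = 1.0 - entropy
    return w / (w.sum() + eps)

def weighted_rbf_affinity(X, w, sigma=1.0):
    """Gaussian affinity built from the attribute-weighted Euclidean distance."""
    diff = X[:, None, :] - X[None, :, :]        # pairwise differences
    d2 = np.einsum('ijk,k->ij', diff ** 2, w)   # weighted squared distances
    return np.exp(-d2 / (2.0 * sigma ** 2))

def sparsify_t_nearest(A, t=10):
    """Keep each sample's t strongest affinities and symmetrize the graph."""
    S = np.zeros_like(A)
    for i in range(A.shape[0]):
        idx = np.argsort(A[i])[-(t + 1):]       # t neighbors plus the point itself
        S[i, idx] = A[i, idx]
    return np.maximum(S, S.T)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic Gaussian clusters stand in for a real data set.
    X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(5, 1, (50, 4))])
    w = entropy_attribute_weights(X)
    A = sparsify_t_nearest(weighted_rbf_affinity(X, w, sigma=2.0), t=10)
    labels = SpectralClustering(n_clusters=2, affinity='precomputed',
                                random_state=0).fit_predict(A)
    print(labels)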
Keywords: spectral clustering; information entropy; sparse t-nearest-neighbor; Nyström low-rank matrix approximation
CLC number: TP181 [Automation and Computer Technology - Control Theory and Control Engineering]