Hybrid text clustering algorithm based on dual particle swarm optimization and K-means algorithm  Cited by: 16


Authors: Wang Yonggui (王永贵)[1], Lin Lin (林琳)[1], Liu Xianguo (刘宪国)[1]

Affiliation: [1] School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China

Source: Application Research of Computers (《计算机应用研究》), 2014, No. 2, pp. 364-368 (5 pages)

Funding: National Natural Science Foundation of China (60903082); Liaoning Provincial Department of Education project (L2012113)

Abstract: The traditional K-means algorithm is sensitive to the choice of initial cluster centers and may converge to a suboptimal solution. To address this, the paper proposes a hybrid text clustering algorithm that combines dual particle swarm optimization with K-means. A self-adjusting inertia weight strategy is designed that adjusts the inertia weight dynamically according to the rate of change of the best fitness value. Two sub-swarms evolve using PSO variants based on different inertia weight strategies; information is exchanged between the offspring of the two sub-swarms and between offspring and parents, so that the best particle is shared and the worst particle is replaced. This procedure is named the dual particle swarm optimization algorithm. The dual particle swarm algorithm, which balances global and local search, is then combined with the efficient K-means algorithm: each particle encodes a set of cluster centers, the fitness function is the reciprocal of the sum of within-class scatter, and each newly generated particle is refined with K-means. Experimental results show that, compared with text clustering algorithms such as K-means and PSO, the proposed algorithm is more robust and gives noticeably better clustering results.
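The procedure outlined in the abstract can be sketched in Python with NumPy. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the inertia weight formula, the exchange rule, or the distance measure, so the following are all assumptions: Euclidean distance, a linearly decreasing weight for the first sub-swarm, a multiplicative self-adjusting rule for the second, acceleration coefficients of 1.5, and an exchange step that copies the shared best particle over each sub-swarm's worst particle. The names `fitness`, `kmeans_refine`, and `dual_pso_kmeans` are hypothetical.

```python
import numpy as np

def _dists(X, centers):
    # Euclidean distance from every document vector to every center: shape (n, k)
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)

def fitness(centers, X):
    """Reciprocal of the summed within-cluster scatter (higher is better)."""
    return 1.0 / (_dists(X, centers).min(axis=1).sum() + 1e-12)

def kmeans_refine(centers, X, iters=1):
    """Run a few K-means iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(iters):
        labels = _dists(X, centers).argmin(axis=1)
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    return centers

def dual_pso_kmeans(X, k, n_particles=10, n_iter=30, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    # Two sub-swarms; each particle is a set of k centers drawn from the data.
    init = [X[rng.choice(n, size=k, replace=False)] for _ in range(2 * n_particles)]
    pos = np.stack(init).reshape(2, n_particles, k, dim)
    vel = np.zeros_like(pos)
    fit = np.array([[fitness(p, X) for p in swarm] for swarm in pos])
    pbest, pbest_fit = pos.copy(), fit.copy()
    s, i = np.unravel_index(fit.argmax(), fit.shape)
    gbest, gbest_fit = pos[s, i].copy(), fit[s, i]
    w_adapt, c1, c2 = 0.9, 1.5, 1.5  # assumed values, not from the paper
    for t in range(n_iter):
        prev_best = gbest_fit
        # Sub-swarm 0: linearly decreasing weight. Sub-swarm 1: self-adjusting.
        weights = (0.9 - 0.5 * t / max(n_iter - 1, 1), w_adapt)
        for s in range(2):
            r1, r2 = rng.random(pos[s].shape), rng.random(pos[s].shape)
            vel[s] = (weights[s] * vel[s]
                      + c1 * r1 * (pbest[s] - pos[s])
                      + c2 * r2 * (gbest - pos[s]))
            pos[s] = pos[s] + vel[s]
            for i in range(n_particles):
                # Refine each newly generated particle with K-means.
                pos[s, i] = kmeans_refine(pos[s, i], X)
                fit[s, i] = fitness(pos[s, i], X)
            improved = fit[s] > pbest_fit[s]
            pbest[s][improved] = pos[s][improved]
            pbest_fit[s][improved] = fit[s][improved]
        # Information exchange (assumed form): share the best particle ...
        s, i = np.unravel_index(pbest_fit.argmax(), pbest_fit.shape)
        if pbest_fit[s, i] > gbest_fit:
            gbest, gbest_fit = pbest[s, i].copy(), pbest_fit[s, i]
        # ... and replace the worst particle of each sub-swarm with it.
        for s in range(2):
            worst = fit[s].argmin()
            pos[s, worst] = pbest[s, worst] = gbest
            fit[s, worst] = pbest_fit[s, worst] = gbest_fit
            vel[s, worst] = 0.0
        # Self-adjusting weight (assumed rule): shrink the weight while the best
        # fitness is still improving, enlarge it when the search stagnates.
        rate = (gbest_fit - prev_best) / prev_best
        w_adapt = float(np.clip(w_adapt * (0.95 if rate > 1e-3 else 1.05), 0.4, 0.9))
    labels = _dists(X, gbest).argmin(axis=1)
    return gbest, labels, float(gbest_fit)
```

For text data, `X` would typically be a matrix of document vectors (for example TF-IDF rows); a cosine-based scatter measure could be substituted in `_dists` if that suits the representation better.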

Keywords: dual particle swarm; self-adjusting inertia weight; information exchange; K-means algorithm; text clustering

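Classification codes: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]; TP301.6 [Automation and Computer Technology: Control Science and Engineering]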

 
