Integrated clustering algorithm for Web documents based on a hybrid of SOM and improved PSO  Cited by: 2


Authors: Song Jianjie (宋剑杰)[1], Wang Wei (王伟)[2]

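Affiliations: [1] Department of Electronic Information, Hunan Vocational College of Science and Technology, Changsha 410004; [2] School of Information Science and Engineering, Central South University, Changsha 410083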

Source: Computer Engineering and Applications (《计算机工程与应用》), 2010, No. 34, pp. 111-114 (4 pages)

Abstract: With the explosive growth of information on the Web, existing search engines fall short of users' needs in many respects. Clustering Web documents narrows the search space, speeds up retrieval, and improves query precision. This paper proposes an integrated Web document clustering algorithm that combines coarse clustering by SOM (Self-Organizing Maps) with fine clustering by an improved PSO (Particle Swarm Optimization). First, following the vector space model, each Web document is represented by its feature terms and their weights. Second, the SOM algorithm performs coarse clustering on the document feature set, yielding a set of output weights. These weights then initialize the improved PSO algorithm, which refines the coarse clustering result to produce the final Web document clusters. Simulation results show that the algorithm effectively improves the precision and recall of document retrieval and has a certain practical value.
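The three stages named in the abstract (vector space representation, SOM coarse clustering, PSO fine clustering seeded from the SOM weights) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the PSO was improved, so a standard global-best PSO with an inertia weight stands in for it. The TF-IDF weighting, whitespace tokenisation, 1-D SOM topology, mean quantisation error as fitness, and all function names and parameter values are assumptions made for this example.

```python
import math
from collections import Counter

import numpy as np


def tfidf_vectors(docs):
    """Vector space model: one L2-normalised TF-IDF row per document (assumed weighting)."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenised for term in tokens})
    index = {term: j for j, term in enumerate(vocab)}
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    X = np.zeros((len(docs), len(vocab)))
    for i, tokens in enumerate(tokenised):
        for term, count in Counter(tokens).items():
            idf = math.log((1 + len(docs)) / (1 + doc_freq[term])) + 1.0
            X[i, index[term]] = count * idf
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)


def som_coarse(X, k, epochs=50, lr0=0.5, seed=0):
    """Coarse clustering with a 1-D SOM of k units; returns the unit weights."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), size=k, replace=False)].copy()
    positions = np.arange(k)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # decaying learning rate
        sigma = max(k / 2.0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for x in X[rng.permutation(len(X))]:
            winner = np.argmin(np.linalg.norm(W - x, axis=1))
            h = np.exp(-((positions - winner) ** 2) / (2.0 * sigma ** 2))
            W += lr * h[:, None] * (x - W)
    return W


def quantisation_error(centroids, X):
    """Mean distance from each document to its nearest centroid (assumed fitness)."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.min(axis=1).mean()


def pso_refine(X, W0, n_particles=10, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Fine clustering: standard gbest PSO (not the paper's variant) seeded from W0."""
    rng = np.random.default_rng(seed)
    pos = W0[None] + 0.05 * rng.standard_normal((n_particles,) + W0.shape)
    pos[0] = W0  # one particle is exactly the SOM solution
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([quantisation_error(p, X) for p in pos])
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([quantisation_error(p, X) for p in pos])
        better = fit < pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        g = int(np.argmin(pbest_fit))
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest


def assign(X, centroids):
    """Label each document with the index of its nearest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)
```

A typical call chain would be `X = tfidf_vectors(docs)`, `W0 = som_coarse(X, k)`, `centroids = pso_refine(X, W0)`, `labels = assign(X, centroids)`. Because one particle starts exactly at the SOM weights and the global best is only replaced by a strictly better position, the refined centroids never score worse than the coarse ones under the chosen fitness.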

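Keywords: Web document clustering; self-organizing feature map; coarse clustering; improved PSO algorithm; fine clustering; integrated clustering algorithm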

Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]

 
