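Affiliations: [1] Department of Electronic Information, Hunan Vocational College of Science and Technology, Changsha 410004 [2] School of Information Science and Engineering, Central South University, Changsha 410083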
Source: Computer Engineering and Applications (《计算机工程与应用》), 2010, No. 34, pp. 111-114 (4 pages)
Abstract: With the explosive growth of information on the Web, current search engines fall short of users' needs in many respects. Clustering Web documents reduces the search space, speeds up retrieval, and improves query precision. This paper proposes an integrated Web document clustering algorithm that combines coarse clustering by SOM (Self-Organizing Maps) with fine clustering by an improved PSO (Particle Swarm Optimization). First, each Web document is represented under the vector space model as a set of feature terms and their weights. Second, the SOM algorithm performs coarse clustering on the document feature set, yielding a set of output weights. These weights then initialize the improved PSO algorithm, which refines the coarse clustering result to produce the final Web document clusters. Simulation results show that the algorithm effectively improves the precision and recall of document retrieval and has practical value.
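Keywords: Web document clustering; self-organizing feature map; coarse clustering; improved PSO algorithm; fine clustering; integrated clustering algorithm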
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
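The three-stage pipeline described in the abstract (vector space representation, SOM coarse clustering, PSO refinement seeded with the SOM weights) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not say how the PSO is improved, so a plain global-best PSO stands in for it. The TF-IDF weighting, the 1-D SOM grid, the quantization-error fitness, all function names and parameter values, and the toy documents are assumptions made for this example.

```python
import math
import random
from collections import Counter

def tfidf_vectors(docs):
    """Vector space model: L2-normalized TF-IDF weights over the vocabulary (assumed weighting)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda j: dist(vec, centroids[j]))

def quantization_error(centroids, vectors):
    """Mean distance from each document to its nearest centroid (assumed fitness)."""
    return sum(min(dist(v, c) for c in centroids) for v in vectors) / len(vectors)

def som_coarse(vectors, k, epochs=30, rng=None):
    """Coarse clustering: a 1-D SOM with k units; returns the unit weight vectors."""
    rng = rng or random.Random(0)
    weights = [list(v) for v in rng.sample(vectors, k)]
    for e in range(epochs):
        frac = 1 - e / epochs                 # decays from 1 toward 0
        lr = 0.5 * frac                       # learning rate
        radius = max(frac * k / 2, 0.5)       # neighbourhood width
        for v in rng.sample(vectors, len(vectors)):
            bmu = nearest(v, weights)         # best-matching unit
            for j, w in enumerate(weights):
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (v[d] - w[d])
    return weights

def pso_fine(vectors, init_centroids, particles=10, iters=40,
             w=0.7, c1=1.5, c2=1.5, rng=None):
    """Fine clustering: plain global-best PSO; each particle encodes all k centroids."""
    rng = rng or random.Random(0)
    k, dim = len(init_centroids), len(init_centroids[0])
    flat0 = [x for c in init_centroids for x in c]
    unflat = lambda p: [p[i * dim:(i + 1) * dim] for i in range(k)]
    fit = lambda p: quantization_error(unflat(p), vectors)
    # Particle 0 is exactly the SOM output; the rest are small perturbations of it.
    pos = [flat0[:]] + [[x + rng.gauss(0, 0.05) for x in flat0]
                        for _ in range(particles - 1)]
    vel = [[0.0] * len(flat0) for _ in pos]
    pbest = [p[:] for p in pos]
    pbest_fit = [fit(p) for p in pos]
    g = min(range(len(pos)), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i, p in enumerate(pos):
            for d in range(len(p)):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - p[d])
                             + c2 * rng.random() * (gbest[d] - p[d]))
                p[d] += vel[i][d]
            f = fit(p)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = p[:], f
                if f < gbest_fit:
                    gbest, gbest_fit = p[:], f
    return unflat(gbest)

# Toy documents, invented purely for illustration.
docs = [
    "python code compiler software",
    "software compiler code bug",
    "football match goal team",
    "team goal football league",
]
vectors = tfidf_vectors(docs)
coarse = som_coarse(vectors, k=2)
fine = pso_fine(vectors, coarse)
labels = [nearest(v, fine) for v in vectors]
print(labels)
```

Because one particle starts exactly at the SOM weights, the refined centroids can never score worse on the chosen fitness than the coarse ones. That is one plausible reading of why seeding PSO with the SOM output is useful, though the paper's own fitness function and PSO modifications may differ.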