An Approach to Subject Database Planning Based on K-means

Authors: Qinglan Fan[1], Lin Wang[1], Yaqing Liu[2], Dun Bai, Mingyu Lu[2]

Affiliations: [1] Research Institute of Highway, Ministry of Transport, Beijing 100088, China; [2] School of Information Science & Technology, Dalian Maritime University, Dalian 116026, Liaoning, China

Source: Scientific Journal of Information Engineering (Chinese-English edition), 2015, No. 6, pp. 173-176 (4 pages)

Funding: Supported by the project "Research on Transportation Data Resource Planning for the ITS Architecture Framework".

Abstract: Subject database planning has long been a focus of research in information resource planning, and the entity aggregation algorithm is a key factor in the quality of the resulting plan. Existing approaches to computing entity aggregation, however, are prone to cluster offset (cluster bias), which degrades planning quality. To address this problem, the authors first compute the affinity of each entity pair, then treat the affinity relations between entity pairs as link relations between web pages and apply the PageRank algorithm to rank the entity pairs by importance, and finally use the K-means algorithm to aggregate the entities iteratively. Experimental results show that the proposed approach avoids cluster offset and thereby improves the quality of subject database planning.
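
The three steps named in the abstract (affinity, PageRank ranking, K-means aggregation) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption: the toy `USAGE` data, the Jaccard-style `affinity` function, ranking individual entities rather than entity pairs, and using the top-ranked entities as initial K-means centers. It is not the authors' implementation.

```python
# Hypothetical toy data: the business processes that use each entity.
USAGE = {
    "Road":    {"p1", "p2", "p3"},
    "Bridge":  {"p1", "p2"},
    "Tunnel":  {"p2", "p3"},
    "Vehicle": {"p4", "p5", "p6"},
    "Driver":  {"p4", "p5"},
    "License": {"p5", "p6"},
}


def affinity(a, b):
    """Assumed affinity: Jaccard overlap of the processes using a and b."""
    return len(USAGE[a] & USAGE[b]) / len(USAGE[a] | USAGE[b])


def pagerank(nodes, weight, damping=0.85, iters=50):
    """Power-iteration PageRank on a weighted graph (affinities as links)."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out = {v: sum(weight(v, u) for u in nodes if u != v) for v in nodes}
    for _ in range(iters):
        rank = {
            v: (1 - damping) / n + damping * sum(
                rank[u] * weight(u, v) / out[u]
                for u in nodes if u != v and out[u] > 0
            )
            for v in nodes
        }
    return rank


def kmeans(vectors, seeds, iters=20):
    """Plain K-means; initial centers are the vectors of the seed entities."""
    centroids = [list(vectors[s]) for s in seeds]
    assign = {}
    for _ in range(iters):
        new_assign = {
            name: min(
                range(len(centroids)),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(vec, centroids[c])),
            )
            for name, vec in vectors.items()
        }
        if new_assign == assign:  # converged: no entity changed cluster
            break
        assign = new_assign
        for c in range(len(centroids)):
            members = [vectors[m] for m, a in assign.items() if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign


def plan_subject_databases(k=2):
    nodes = list(USAGE)
    rank = pagerank(nodes, affinity)
    # Assumption: the k highest-ranked entities seed the clusters.
    seeds = sorted(nodes, key=rank.get, reverse=True)[:k]
    # Each entity is described by its affinity to every entity.
    vectors = {a: [affinity(a, b) for b in nodes] for a in nodes}
    groups = {}
    for name, c in kmeans(vectors, seeds).items():
        groups.setdefault(seeds[c], []).append(name)
    return groups


if __name__ == "__main__":
    for seed, members in plan_subject_databases().items():
        print(seed, "->", sorted(members))
```

On this toy data the road-related and vehicle-related entities end up in separate candidate subject databases. Seeding from the ranking instead of random centers is one plausible reading of how importance ordering could reduce cluster offset. Taking the top-ranked entities as seeds is the simplest choice and can place two seeds in the same natural group on less symmetric data.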

Keywords: subject database; information resource planning; cluster offset

Classification: TP [Automation and Computer Technology]
