Text Hierarchical Clustering Based on Several Domain Ontologies

Cited by: 11

Source: Computer Science (《计算机科学》), 2010, Issue 3, pp. 199-204 (6 pages)

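Funding: Supported by the National Natural Science Foundation of China (No. 60373099; 60603031); the Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (No. 20060183044; 200801830021); the Natural Science Foundation of Jilin Province (No. 20070533); and the Jilin University Basic Research Fund, Interdisciplinary and Innovation Project (No. 200810025).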

Abstract: Traditional clustering methods are usually based on the similarity of keywords appearing in documents. Because this discards much of the semantic information, the clustering results are not accurate enough and often require a large amount of computation. This paper proposes a method for hierarchically clustering documents based on several domain ontologies. The method first uses the domain ontologies to transform each keyword-based feature vector into the set of concept vectors that match it. It then defines a formula for calculating the similarity between documents. Finally, an agglomerative document clustering algorithm based on several domain ontologies is designed, and a corresponding prototype system is implemented. The experimental results show that the method represents and processes documents at the concept level. By reducing the dimension of the space of clustered objects, it decreases the amount of computation and improves both the accuracy and the efficiency of document clustering.
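The pipeline outlined in the abstract (keyword vectors mapped to concept vectors, a document similarity measure, then agglomerative merging) can be illustrated with a minimal sketch. The abstract does not give the paper's similarity formula or linkage rule, so cosine similarity and average linkage are used here purely as stand-in assumptions. The toy ontologies, the names `ONTOLOGIES`, `to_concept_vector`, `cosine`, and `agglomerate`, and the threshold value are all hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy ontologies: each maps a keyword to the concept it denotes.
ONTOLOGIES = {
    "medicine": {"tumor": "neoplasm", "cancer": "neoplasm",
                 "aspirin": "drug", "ibuprofen": "drug"},
    "computing": {"cpu": "processor", "gpu": "processor",
                  "python": "language", "java": "language"},
}


def to_concept_vector(keyword_weights):
    """Fold a keyword->weight vector into a (domain, concept)->weight vector."""
    concepts = {}
    for keyword, weight in keyword_weights.items():
        for domain, mapping in ONTOLOGIES.items():
            if keyword in mapping:
                key = (domain, mapping[keyword])
                concepts[key] = concepts.get(key, 0.0) + weight
    return concepts


def cosine(u, v):
    """Cosine similarity of two sparse vectors (assumed measure, not the paper's)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def agglomerate(vectors, threshold):
    """Average-linkage agglomerative clustering over document indices.

    Repeatedly merges the most similar pair of clusters until the best
    similarity drops below `threshold`. Returns the final clusters and
    the merge history (which encodes the hierarchy).
    """
    clusters = [[i] for i in range(len(vectors))]
    merges = []

    def linkage(a, b):
        total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        sim = linkage(clusters[x], clusters[y])
        if sim < threshold:
            break
        merges.append((clusters[x], clusters[y], sim))
        merged = clusters[x] + clusters[y]
        clusters = [c for i, c in enumerate(clusters) if i not in (x, y)]
        clusters.append(merged)
    return clusters, merges


# Four toy documents as keyword->weight vectors.
docs = [
    {"tumor": 2.0, "aspirin": 1.0},
    {"cancer": 1.0, "ibuprofen": 1.0},
    {"cpu": 1.0, "python": 2.0},
    {"gpu": 2.0, "java": 1.0},
]
concept_vectors = [to_concept_vector(d) for d in docs]
clusters, merges = agglomerate(concept_vectors, threshold=0.5)
```

In this toy data no two documents share a keyword, so keyword-level cosine similarity is zero for every pair. After mapping, the eight keywords collapse to four concepts, which both shrinks the vector space and lets documents on the same topic become comparable.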

Keywords: domain ontology; similarity computation; agglomerative hierarchical clustering

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
