Semantic Vector Space Model Based on Domain Ontology (基于领域本体的语义向量空间模型)

Cited by: 15

Authors: Tang Mingwei (唐明伟)[1], Bian Yijie (卞艺杰)[1], Tao Feifei (陶飞飞)[1]

Affiliation: [1] Business School, Hohai University, Nanjing 210098

Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2011, No. 9, pp. 951-955 (5 pages)

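Funding: Project funded by the Suzhou Science and Technology Bureau, "Research on the Management Mechanism and Network Information Platform of the Innovation Relay Center (创新驿站)" (Project No. 2010524712)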

Abstract: The classical vector space model assumes that keywords are mutually independent, and this assumption limits its retrieval performance. To address the problem, this paper reviews and analyzes the improvements to the classical vector space model proposed by researchers in China and abroad. To remedy the shortcomings of that work, it analyzes the characteristics of the classical model and builds a domain ontology to establish semantic relations between the model's keywords. By computing the semantic similarity between keywords, it proposes the concept of semantic increment, which quantifies those relations. Using the semantic increment, it improves the TF-IDF algorithm into the STF-IDF algorithm and builds a semantic vector space model on that basis, with the aim of improving the classical model's performance in semantic retrieval. Finally, a worked example shows that the new model outperforms the original in both recall and precision.

Keywords: domain ontology; semantic similarity; vector space model; TF-IDF; semantic increment

Classification No.: G354.4 [Cultural Sciences / Information Science]
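
The abstract names the STF-IDF weighting but does not give its formula, so the following Python sketch is only an illustration of the general idea, not the authors' definition. It assumes that the semantic increment of a term in a document is the similarity-weighted sum of the frequencies of the other terms in that document, and that this increment is added to the raw term frequency before multiplying by IDF. The similarity table, the function names, and the toy corpus are all hypothetical; in the paper the similarities would come from a domain ontology.

```python
import math
from collections import Counter

# Hypothetical pairwise semantic similarities in [0, 1].
# In the paper these would be computed from a domain ontology.
SIMILARITY = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("car", "truck")): 0.5,
}


def sim(a, b):
    """Similarity of two keywords; unrelated pairs default to 0."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset((a, b)), 0.0)


def idf(term, docs):
    """Plain inverse document frequency, log(N / df); 0 if the term is unseen."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else 0.0


def tf_idf(term, doc, docs):
    """Classical weight: keywords are treated as independent."""
    return Counter(doc)[term] * idf(term, docs)


def stf_idf(term, doc, docs):
    """Assumed semantic variant: raw tf plus a similarity-weighted increment."""
    counts = Counter(doc)
    increment = sum(
        sim(term, other) * n for other, n in counts.items() if other != term
    )
    return (counts[term] + increment) * idf(term, docs)


# Toy corpus (hypothetical), one token list per document.
docs = [
    ["automobile", "engine", "repair"],
    ["car", "engine"],
    ["river", "flood", "control"],
]

if __name__ == "__main__":
    for i, doc in enumerate(docs):
        print(i, tf_idf("car", doc, docs), stf_idf("car", doc, docs))
```

Under these assumptions, a document that mentions only "automobile" gets zero weight for the query term "car" under classical TF-IDF but a positive weight under the semantic variant, while an unrelated document stays at zero. This is the kind of recall gain the abstract attributes to modelling semantic relations between keywords.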

 
