Concept-based approach for information retrieval  (Cited by: 1)

Language: English (Chinese title: 一种基于概念的信息检索方法)


Authors: Wu Chen[1], Zhang Quan[1], Jia Ning[1]

Affiliation: [1] Graduate School of the Chinese Academy of Sciences

Source: Journal of Southeast University (English Edition), 2006, No. 3, pp. 324-329 (6 pages)

Funding: The National Basic Research Program of China (973 Program) (No. 2004CB318104), the Knowledge Innovation Program of Chinese Academy of Sciences (No. 13CX04).

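Abstract: A concept-based approach is expected to resolve word sense ambiguities in information retrieval and to represent the contents of a document by the semantic importance of its concepts rather than by term frequency. To this end, a formalized document framework is proposed. The framework expresses the meaning of a document through the concepts of high semantic importance and consists of two parts: the "domain" information and the "situation & background" information of the document. An algorithm for extracting the framework from a document and a two-stage smoothing method are also proposed; the similarity between a query and a document framework is quantified using the smoothing method. Experiments on the TREC6 collection demonstrate the feasibility and effectiveness of the proposed approach in information retrieval tasks: the average recall-level precision of the model using the proposed approach is about 10% higher than that of traditional models.

Abstract (translated from the Chinese): To obtain the semantic weights of words in a document, resolve the ambiguity caused by synonymy and polysemy, and improve the efficiency of information retrieval, a concept-based retrieval model is proposed. The model uses a formalized framework for representing document content. The framework consists of two parts, the "domain" of the document and its "situation and background" information, both expressed by concepts (formalized semantics). A method for extracting this concept framework is proposed, together with a two-stage smoothing algorithm for matching the framework against the retrieval request. Experiments show that, on the small-scale corpus provided by TREC6, an information retrieval model using the proposed method improves average recall precision by about 10% over traditional models, which demonstrates the effectiveness and feasibility of an information retrieval system built on the described method with concepts as the processing intermediary.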

Keywords: information retrieval; concept; semantic knowledge; content representation

Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]

 
