Research on the Concept-based Information Retrieval Model (基于概念的信息检索模型研究)  Cited by: 33


Authors: Li Zhendong (李振东) [1], Fei Xianglin (费翔林) [1]

Affiliation: [1] State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093

Source: Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》), 2002, No. 1, pp. 99-109 (11 pages)

Funding: National Science Fund for Distinguished Young Scholars (61525204)

Abstract: With the rapid development of the Internet, the World Wide Web has become the largest information repository in the world, and it is increasingly changing the way people live. However, because the Web's information resources are vast and complexly structured, finding the needed information efficiently has become a major problem for Web users. Many well-known sites, such as Yahoo, AltaVista, and Infoseek, use keyword-based search engines, which share an obvious defect: when the words in a query differ from those used in the target document, the search fails even though the two have the same meaning, which leads to low recall. This paper presents a concept-based information retrieval model that uses concepts rather than keywords as its core. It introduces the facilities, methods, and tools of the model. The main contents are: (1) the design and construction of a concept lexicon that supports the mapping between terms and concepts, so that each concept can ultimately be located in the concept tree; (2) the design and construction of a concept tree that expresses the hierarchy of knowledge and the relations among concepts. The concept lexicon and the concept tree constitute the meta-knowledge of the model, on which the comprehension of concepts is based; a semi-automatic algorithm for adjusting the concept tree and the concept lexicon is also discussed; (3) the design of concept-based classification and search algorithms.
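The idea in the abstract, matching on concepts rather than on literal keywords, can be illustrated with a minimal sketch. This is not the authors' algorithm: the lexicon entries, the tree, the sample documents, and every function name (`extract_concepts`, `ancestors`, `build_index`, `search`) are hypothetical and chosen only for illustration. It assumes whitespace-separated English terms and a tree stored as child-to-parent links.

```python
from collections import defaultdict

# Hypothetical concept lexicon: maps surface terms to concept identifiers.
CONCEPT_LEXICON = {
    "car": "automobile",
    "automobile": "automobile",
    "auto": "automobile",
    "truck": "truck",
    "lorry": "truck",
    "vehicle": "vehicle",
}

# Hypothetical concept tree, stored as child concept -> parent concept.
CONCEPT_PARENT = {
    "automobile": "vehicle",
    "truck": "vehicle",
    "vehicle": None,
}


def extract_concepts(text):
    """Map each known term in the text to its concept; unknown terms are ignored."""
    return {CONCEPT_LEXICON[t] for t in text.lower().split() if t in CONCEPT_LEXICON}


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = CONCEPT_PARENT.get(concept)
    return chain


def build_index(docs):
    """Index each document under its concepts and all of their ancestors."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in extract_concepts(text):
            for c in ancestors(concept):
                index[c].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents sharing at least one concept with the query."""
    hits = set()
    for concept in extract_concepts(query):
        hits |= index.get(concept, set())
    return sorted(hits)


# Toy document collection (illustrative only).
DOCS = {
    "d1": "used car for sale",
    "d2": "lorry driver wanted",
    "d3": "weather report",
}
INDEX = build_index(DOCS)
```

In this sketch the query "automobile" retrieves `d1` even though that document only contains the word "car", and the broader query "vehicle" retrieves both `d1` and `d2` through the tree. A pure keyword match would miss both cases, which is the low-recall problem the abstract describes.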

Keywords: information retrieval; search engine; concept tree; concept lexicon; concept extraction; concept matching; model; computer network

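Classification: TP393 [Automation and Computer Technology - Computer Application Technology]; TP391.3 [Automation and Computer Technology - Computer Science and Technology]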

 
