Source: Computer Engineering and Applications (《计算机工程与应用》), 2006, No. 36, pp. 158-161 (4 pages)
Funding: Key Science and Technology Research Project of the Ministry of Education (03144); Natural Science Foundation of Hainan Province (60533)
Abstract: Statistical language modeling (SLM), a probabilistic approach to information retrieval (IR), has achieved great success and been studied extensively in recent years, and is generally regarded as a new trend in probability-based retrieval research. This paper first reviews the development of SLM and the commonly used N-gram model. It then gives formal descriptions of the main language models used in the IR process, compares them with one another, and identifies their similarities to and differences from traditional probabilistic retrieval methods. Finally, it analyzes the weaknesses of SLM, describes possible improvements and recent research progress, and discusses applications and challenges in Chinese IR.
Classification number: TP391 [Automation and Computer Technology - Computer Application Technology]