Research on a Textual Case Retrieval Algorithm Based on Topic Words (基于主题词的文本案例检索算法研究)

Algorithm Optimization about Textual Case Retrieval Based on Topic Words


Authors: Sun Zhen (孙镇), Yuan Hui (袁辉) [2], Sun Tai (孙泰) [2], Gong Zheng (宫政) [2], Zhao Jie (赵捷) [2], Tang Lei (汤磊) [3]

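Affiliations: [1] Peking University, Beijing; [2] National Administration for Code Allocation to Organizations (全国组织机构代码管理中心), Beijing; [3] Chinese Academy of Surveying and Mapping, Beijing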

Source: Computer Science and Application (《计算机科学与应用》), 2013, No. 8, pp. 354-359 (6 pages)

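Funding: Supported by the project "Research on Real-Name Registration Technology and Systems for Social Management (Microblogging)" (No. 201310027) and by the National High Technology Research and Development Program of China (No. G1213).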

Abstract: An analysis of Boolean retrieval, the traditional text retrieval method, reveals two shortcomings: it ignores the semantic relations between words, and it cannot rank retrieval results by importance. To address both, an improved retrieval algorithm based on topic words is proposed. A keyword library is built by enriching the set of topic words, and the semantic distance and similarity between keywords are computed on top of a semantic information retrieval framework. Finally, the improved algorithm is applied to a disaster case retrieval system (the record's English abstract reads "military case retrieval system"), and the retrieval results are analyzed for performance. The experiments show that the algorithm improves both the precision and the recall of text retrieval.
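To make the contrast in the abstract concrete, the following is a minimal sketch and not the authors' algorithm: the record does not give the paper's formulas. The keyword library TOPIC_DISTANCE, its integer distances, the conversion sim = 1 / (1 + d), the averaging score, and the sample documents DOCS are all assumptions chosen for illustration. It shows plain Boolean AND retrieval next to a topic-word-expanded search that ranks results, and how precision and recall are computed.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical keyword library: query term -> related terms with an assumed
# semantic distance (0 = the term itself). Values are illustrative only.
TOPIC_DISTANCE: Dict[str, Dict[str, int]] = {
    "flood": {"flood": 0, "inundation": 1, "rainstorm": 2},
    "rescue": {"rescue": 0, "evacuation": 1},
}

# Hypothetical case texts.
DOCS: Dict[str, str] = {
    "d1": "flood rescue operation in the river basin",
    "d2": "inundation forced evacuation of the town",
    "d3": "rainstorm warning issued",
    "d4": "earthquake drill schedule",
}


def tokenize(text: str) -> Set[str]:
    return set(text.lower().split())


def similarity(distance: int) -> float:
    # Assumed conversion from distance to similarity; one common choice.
    return 1.0 / (1.0 + distance)


def boolean_and(query: List[str], docs: Dict[str, str]) -> List[str]:
    """Plain Boolean AND: exact term match only, results are unranked."""
    return [doc_id for doc_id, text in docs.items()
            if all(term in tokenize(text) for term in query)]


def ranked_search(query: List[str], docs: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score each document by the mean, over query terms, of the best
    similarity among related terms found in it; sort by score, descending."""
    results = []
    for doc_id, text in docs.items():
        words = tokenize(text)
        total = 0.0
        for term in query:
            related = TOPIC_DISTANCE.get(term, {term: 0})
            total += max((similarity(d) for w, d in related.items() if w in words),
                         default=0.0)
        score = total / len(query)
        if score > 0:
            results.append((doc_id, score))
    return sorted(results, key=lambda r: r[1], reverse=True)


def precision_recall(retrieved: List[str], relevant: Set[str]) -> Tuple[float, float]:
    hits = len(set(retrieved) & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    query = ["flood", "rescue"]
    relevant = {"d1", "d2"}  # assumed ground truth for this toy example
    print(boolean_and(query, DOCS))
    print(ranked_search(query, DOCS))
    print(precision_recall(boolean_and(query, DOCS), relevant))
```

In this toy setup the Boolean query matches only the document containing both literal terms, while the expanded search also returns documents that use related words and orders them by score, which is the kind of gain in recall and ranking the abstract describes.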

Keywords: Boolean retrieval; topic words; semantic distance; improved retrieval algorithm; precision; recall

Classification No.: TP39 [Automation and Computer Technology: Computer Application Technology]

 
