Weighted Semantic Similarity Algorithm Based on Specific Area  (Cited by: 1)

Authors: Gao Leina (高蕾娜)[1], Shi Yanfeng (史延枫)[1], Li Yandan (李艳丹)[2]

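Affiliations: [1] School of Mechanical Engineering, Chengdu University, Chengdu, Sichuan 610106; [2] School of Mechanical Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074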

Source: Journal of Chengdu University (Natural Science Edition), 2015, No. 3, pp. 259-261, 274 (4 pages)

Abstract: The information retrieval module is a major component of an automatic question-answering system, and the key problem in question retrieval is computing sentence similarity. This paper proposes a weighted semantic similarity algorithm for a specific domain. The algorithm first computes the weights of the keywords of a question in the FAQ library. It then uses a semantic similarity method to build the similarity matrix between each segmented word of the target question and the keywords of the FAQ question, and from this derives the final similarity of the two sentences. The target question is compared in turn with every question in the FAQ library; if the maximum similarity exceeds a given threshold, the answer to the corresponding FAQ question is returned to the user. Because the algorithm takes both word semantics and word weights into account, experiments show that it achieves good matching results.
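The pipeline described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the weighting formula, the word-level similarity measure, or the rule for combining the matrix into one score. Here `keyword_weights` uses an IDF-style weight, `toy_word_sim` is a hypothetical lookup table standing in for a real semantic similarity method, and `sentence_similarity` takes the best match per FAQ keyword and averages by weight. The threshold value 0.6 is also an assumption.

```python
import math
from typing import Callable, Dict, List, Optional

WordSim = Callable[[str, str], float]


def keyword_weights(faq_questions: List[List[str]]) -> Dict[str, float]:
    """IDF-style keyword weights (assumed; the abstract gives no formula)."""
    n = len(faq_questions)
    doc_freq: Dict[str, int] = {}
    for question in faq_questions:
        for word in set(question):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    # Rarer keywords get larger weights; the +1 terms keep weights positive.
    return {w: math.log((n + 1) / (c + 1)) + 1.0 for w, c in doc_freq.items()}


def sentence_similarity(target: List[str], keywords: List[str],
                        weights: Dict[str, float], word_sim: WordSim) -> float:
    """Weighted average of each FAQ keyword's best match in the target."""
    if not target or not keywords:
        return 0.0
    # Similarity matrix: one row per FAQ keyword, one column per target word.
    matrix = [[word_sim(k, t) for t in target] for k in keywords]
    total = sum(weights.get(k, 1.0) for k in keywords)
    score = sum(weights.get(k, 1.0) * max(row)
                for k, row in zip(keywords, matrix))
    return score / total


def best_answer(target: List[str], faq_questions: List[List[str]],
                answers: List[str], word_sim: WordSim,
                threshold: float = 0.6) -> Optional[str]:
    """Return the answer of the most similar FAQ question above the threshold."""
    if not faq_questions:
        return None
    weights = keyword_weights(faq_questions)
    scores = [sentence_similarity(target, q, weights, word_sim)
              for q in faq_questions]
    best = max(range(len(scores)), key=scores.__getitem__)
    return answers[best] if scores[best] > threshold else None


# Hypothetical stand-in for a real semantic similarity method.
_SYNONYMS = {frozenset(("reset", "recover")): 0.8}


def toy_word_sim(a: str, b: str) -> float:
    if a == b:
        return 1.0
    return _SYNONYMS.get(frozenset((a, b)), 0.0)


# Hypothetical, already-segmented FAQ data for illustration only.
FAQ_QUESTIONS = [["reset", "password"], ["change", "email"]]
FAQ_ANSWERS = ["A1", "A2"]
```

With this toy data, `best_answer(["recover", "password"], FAQ_QUESTIONS, FAQ_ANSWERS, toy_word_sim)` selects the first FAQ entry, while a target with no related words, such as `["weather"]`, falls below the threshold and yields `None`.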

Keywords: automatic question-answering system; information retrieval; similarity; semantics; word weight

Classification number: TP391.3 [Automation and Computer Technology: Computer Application Technology]

 
