Question Similarity Algorithm Based on Common Chunks and N-Gram Model (基于公共词块及N-gram模型的问句相似度算法)

Cited by: 7

Source: Journal of Chongqing University of Technology (Natural Science), 2017, No. 10, pp. 175-179, 197 (6 pages)

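Funding: Ministry of Education Humanities and Social Sciences Youth Project (16YJC860010); Chongqing Social Science Planning Doctoral Project (2015BS059)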

Abstract: Question similarity is a core problem in question answering (QA) systems and directly affects their accuracy. Because the common chunks similarity algorithm (CCS) is poorly suited to Chinese text, this paper proposes an improved question similarity algorithm (CNS) that combines the N-gram model with common chunks to compute the similarity of question vectors. The main idea is to decompose each question into a unigram model and a bigram model, then analyze the common chunks between the questions while taking their sequential structure into account. Experimental results show that the new algorithm outperforms commonly used question similarity algorithms both in average similarity on Top-N data sets and in accuracy under different similarity thresholds.

Keywords: question similarity; N-gram model; unigram model; common chunks

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
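
The abstract describes the approach only in outline: split each question into unigrams and bigrams, then look at the chunks the two questions share, in order. The Python sketch below illustrates that general idea and is not the paper's CNS formula, which this record does not give. Everything specific here is an assumption made for illustration: character-level n-grams, cosine similarity over n-gram counts, an order-preserving chunk score built on `difflib.SequenceMatcher`, the function names, and the weights `(0.3, 0.3, 0.4)`.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def char_ngrams(text, n):
    """Return all character n-grams of `text` (empty if text is shorter than n)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def chunk_score(q1, q2):
    """Score shared chunks that appear in the same order in both questions.

    Assumption: chunks are the matching blocks found by SequenceMatcher,
    and longer chunks count more (squared length). The result is in [0, 1].
    """
    if not q1 or not q2:
        return 0.0
    matcher = SequenceMatcher(None, q1, q2, autojunk=False)
    blocks = matcher.get_matching_blocks()  # ordered, non-crossing matches
    return sum(m.size ** 2 for m in blocks) / (len(q1) * len(q2))


def question_similarity(q1, q2, weights=(0.3, 0.3, 0.4)):
    """Weighted mix of unigram cosine, bigram cosine, and chunk score.

    The weights are hypothetical placeholders, not values from the paper.
    """
    w_uni, w_bi, w_chunk = weights
    uni = cosine(Counter(char_ngrams(q1, 1)), Counter(char_ngrams(q2, 1)))
    bi = cosine(Counter(char_ngrams(q1, 2)), Counter(char_ngrams(q2, 2)))
    return w_uni * uni + w_bi * bi + w_chunk * chunk_score(q1, q2)


if __name__ == "__main__":
    # Two paraphrased questions and one unrelated question.
    print(question_similarity("如何申请信用卡", "怎样申请信用卡"))
    print(question_similarity("如何申请信用卡", "今天天气怎么样"))
```

Working on characters rather than words avoids the need for a Chinese word segmenter in this sketch; a word-level variant would pass token lists instead of strings to the same functions.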
