Study on the Similarity of Question Sentences in Question and Answer System

Cited by: 3

Authors: SONG Wen-chuang (宋文闯); LIU Liang-liang (刘亮亮); ZHANG Zai-yue (张再跃) [1]

Affiliations: [1] School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212003, Jiangsu, China; [2] School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai 201620, China

Source: Software Guide (《软件导刊》), 2020, Issue 7, pp. 148-152 (5 pages)

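Funding: National Natural Science Foundation of China (61371114, 611170165); project of the Jiangsu Universities Collaborative Innovation Center for High-Tech Ships / Marine Equipment Research Institute of Jiangsu University of Science and Technology (1174871701-9).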

Abstract: Questions posted by users on Baidu Zhidao (百度知道) are typically short, so conventional methods such as vector-space TF-IDF sentence similarity and semantic-dependency-based sentence similarity often fail to compute their similarity well. To address this, the paper exploits the characteristics of short questions: it introduces the ideas of question elements (问题元) and word patterns (词模) to decompose user questions, fuses the result with traditional similarity measures, and proposes a new similarity calculation method. For questions shorter than 20 words, the F1 score improves by 12% over the traditional TF-IDF method.
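To make the baseline in the abstract concrete, the following is a minimal sketch of vector-space TF-IDF cosine similarity for short questions, plus one possible way of fusing it with an overlap score over question elements. The abstract does not say how question elements are extracted or how the scores are combined, so `element_overlap` (Jaccard overlap), `fused_similarity` (a weighted sum), and the weight `alpha` are assumptions made for illustration only, not the authors' method. The sample questions are invented and pre-tokenized; Chinese questions would first need a word segmenter.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized questions."""
    n = len(docs)
    # Document frequency: in how many questions each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF, so a term found in every question keeps a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (cnt / len(doc)) * idf[t] for t, cnt in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def element_overlap(a, b):
    """Jaccard overlap of two collections of question elements (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def fused_similarity(tfidf_sim, elem_sim, alpha=0.5):
    """Weighted sum of the two scores; alpha is a hypothetical tuning weight."""
    return alpha * tfidf_sim + (1.0 - alpha) * elem_sim


# Invented, pre-tokenized example questions.
questions = [
    ["how", "to", "reset", "password"],
    ["how", "reset", "my", "password"],
    ["best", "pizza", "in", "town"],
]
vecs = tfidf_vectors(questions)
sim_close = cosine(vecs[0], vecs[1])  # the two questions share three terms
sim_far = cosine(vecs[0], vecs[2])    # the two questions share no terms
```

With so few tokens per question, a single differing word shifts the cosine score noticeably, which is the weakness on short questions that the abstract points to; a second signal such as element overlap is one way to compensate.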

Keywords: question element; keyword expansion; TF-IDF; sentence similarity; question answering system

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
