Research on Sentence Similarity Based on Character Vector and Enhanced Representation BiLSTM  (Cited by: 2)


Authors: JIA Chang (贾畅); YE Fei (叶飞); LIU Shuai-jun (刘帅君); MA Zhi-run (麻之润) (School of Big Data, Yunnan Agricultural University, Kunming 650201, China)

Affiliation: [1] School of Big Data, Yunnan Agricultural University, Kunming 650201, Yunnan, China

Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 10, pp. 97-100, 186 (5 pages)

Funding: Yunnan Province Major Science and Technology Special Project (2018ZI001-2).

Abstract: Current word segmentation tools cannot effectively segment finance-related vocabulary in intelligent customer service for the financial domain, and word-based models are more susceptible to data sparsity and out-of-vocabulary words. To address this problem, a sentence similarity model based on character vectors and an enhanced-representation BiLSTM, named EBiLSTM, is proposed. The model first uses a bidirectional long short-term memory network (BiLSTM) to extract character features and their contextual representations from sentences composed of character embeddings. It then computes, for each sentence in a pair, a soft-aligned representation with respect to the other sentence, and enhances the final sentence representation through the interaction between each sentence representation and its aligned representation. The proposed model can effectively learn the semantic relationship of a sentence pair, and the added enhanced representation layer better captures the semantic differences between the two sentences through their interaction. Experiments show that, on a real-world dataset, the proposed model outperforms both the word-vector-based and the character-vector-based CNN and BiLSTM methods in precision, recall, and F1.
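The abstract describes two steps after the BiLSTM encoder: soft alignment of each sentence against the other, and enhancement through interaction between a representation and its aligned counterpart. The following is a minimal sketch of those two steps, not the authors' implementation. It assumes dot-product attention for the alignment and a concatenation of the vector, its aligned vector, their difference, and their element-wise product for the enhancement; the abstract specifies neither. The function names (`soft_align`, `enhance`) and the toy vectors standing in for per-character BiLSTM outputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def soft_align(a, b):
    """For each vector in a, return an attention-weighted sum of the vectors in b.

    Assumption: attention scores are plain dot products.
    """
    dim = len(b[0])
    aligned = []
    for a_i in a:
        weights = softmax([dot(a_i, b_j) for b_j in b])
        aligned.append(
            [sum(w * b_j[k] for w, b_j in zip(weights, b)) for k in range(dim)]
        )
    return aligned


def enhance(a, a_aligned):
    """Concatenate each vector with its aligned vector, their difference,
    and their element-wise product (an assumed form of the interaction)."""
    return [
        v + t + [x - y for x, y in zip(v, t)] + [x * y for x, y in zip(v, t)]
        for v, t in zip(a, a_aligned)
    ]


# Toy stand-ins for BiLSTM outputs: one 2-dimensional vector per character.
sent_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sent_b = [[1.0, 0.0], [0.0, 2.0]]

a_aligned = soft_align(sent_a, sent_b)  # sent_a expressed in terms of sent_b
b_aligned = soft_align(sent_b, sent_a)  # sent_b expressed in terms of sent_a
a_enhanced = enhance(sent_a, a_aligned)
b_enhanced = enhance(sent_b, b_aligned)
```

Each enhanced sequence keeps one vector per character but with four times the original dimension. In a full model these vectors would be pooled and passed to a classifier that predicts whether the two sentences are similar; that part is omitted here.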

Keywords: intelligent customer service; sentence similarity; recurrent neural network; character vector; sentence alignment

Classification: TP391 [Automation and Computer Technology - Computer Application Technology]

 
