Text similarity calculation method using improved Bi-LSTM (改进Bi-LSTM的文本相似度计算方法)  Cited by: 5

Authors: FENG Yue-chun [1]; CHEN Hui-juan [2]

Affiliations: [1] School of Computer Science, Ningxia Institute of Science and Technology, Shizuishan 753000, Ningxia, China; [2] School of Computer Science, Xi'an Polytechnic University, Xi'an 710000, Shaanxi, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2022, No. 5, pp. 1397-1403 (7 pages)

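Funding: Key Research and Development Program of the Science and Technology Department of Ningxia Hui Autonomous Region (2019BEB02020); Scientific Research Fund Project of Ningxia Higher Education Institutions (NGY2018-166).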

Abstract: To improve the accuracy of text similarity detection in natural language processing tasks, a text similarity calculation method based on an improved bidirectional long short-term memory network (Bi-LSTM) is proposed. The input sentence is converted into a sequence of word vectors, Bi-LSTM extracts the best word features from each word vector, and an attention mechanism is introduced to reduce the influence of non-key factors. Multi-level similarity weighting then compares the two sentences at three levels (word to word, sentence to sentence, and word to sentence), and the level scores are weighted to obtain the final similarity. The method is evaluated on three data sets: SMTeuroparl, MSRvid, and MSRpar. The experimental results show that the proposed method computes text similarity better than the other methods compared and is suitable for processing complex long texts.
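
The three-level comparison described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption and not the authors' formulation: the word feature vectors are taken as given (in the paper they would come from the Bi-LSTM), attention is approximated by softmax pooling against the mean word vector, each level uses cosine similarity, and the hypothetical `weights` parameter defaults to equal weighting.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_vector(words):
    """Element-wise mean of a non-empty list of word vectors."""
    dim = len(words[0])
    return [sum(w[i] for w in words) / len(words) for i in range(dim)]

def attention_pool(words, query):
    """Softmax-weighted average of word vectors, scored by dot product with `query`."""
    scores = [sum(a * b for a, b in zip(w, query)) for w in words]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    dim = len(words[0])
    return [sum(e / total * w[i] for e, w in zip(exps, words)) for i in range(dim)]

def multi_level_similarity(a, b, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted mix of word-word, sentence-sentence, and word-sentence similarity.

    `a` and `b` are non-empty lists of word feature vectors (assumed precomputed).
    The pooling query and the equal default weights are illustrative assumptions.
    """
    sent_a = attention_pool(a, mean_vector(a))
    sent_b = attention_pool(b, mean_vector(b))

    # Word to word: each word's best match in the other sentence, both directions.
    word_word = (sum(max(cosine(x, y) for y in b) for x in a) / len(a)
                 + sum(max(cosine(y, x) for x in a) for y in b) / len(b)) / 2

    # Sentence to sentence: compare the two pooled sentence vectors.
    sent_sent = cosine(sent_a, sent_b)

    # Word to sentence: each word against the other sentence's pooled vector.
    word_sent = (sum(cosine(x, sent_b) for x in a) / len(a)
                 + sum(cosine(y, sent_a) for y in b) / len(b)) / 2

    w1, w2, w3 = weights
    return w1 * word_word + w2 * sent_sent + w3 * word_sent
```

Calling `multi_level_similarity(a, b)` with two lists of word vectors returns a single score. How the three level weights are actually set is not stated in the abstract, so they are left as a parameter here.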

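Keywords: text similarity; deep learning; bidirectional long short-term memory network; attention mechanism; multi-level similarity weighting; contextual information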

Classification number: TP312 [Automation and Computer Technology: Computer Software and Theory]

 
