Text Similarity Analysis Algorithm Combining Attention Mechanism and MatchPyramid (联合注意力机制与MatchPyramid的文本相似度分析算法)  Cited by: 2

Authors: DAI Xiang (代翔)[1]; SUN Haichun (孙海春); ZHU Rongchen (朱容辰); SUN Tianyang (孙天杨)

Affiliation: [1] School of Information Network Security, People's Public Security University of China, Beijing 100038, China

Source: Computer Engineering and Applications (《计算机工程与应用》), 2022, No. 19, pp. 158-165 (8 pages)

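Funding: National Natural Science Foundation of China (41971367); National Key Research and Development Program of China (2017YFC0803700); Technology Research Program of the Ministry of Public Security (2020JSYJC22ok).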

Abstract: Text similarity analysis is a core task in natural language processing, and deep text matching models are currently the mainstream approach to it. To address the shortcomings of the traditional MatchPyramid model in text feature extraction, this paper proposes a text similarity analysis method based on an enhanced MatchPyramid model. The method adds a multi-head self-attention mechanism and a mutual attention mechanism to the input encoding layer, and applies an autoencoder to reduce the dimensionality of the word vectors fed into the two attention mechanisms, which lowers the computational cost of the model. The output of the dual attention mechanism is then concatenated with the original word vectors, improving how well the word vectors represent the key information in the text. Finally, the single-channel map formed by the dot product of the two texts' word-vector matrices is mapped into multiple feature subspaces to form a multi-channel map, and a densely connected convolutional neural network extracts features from it. Experimental results show that, compared with the traditional MatchPyramid model, the proposed model improves accuracy by 1.59 percentage points and the F1 score by 2.49 percentage points.
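
As a rough illustration of two steps named in the abstract (concatenating attention output with the original word vectors, then forming the single-channel matching map by dot product), the following is a minimal sketch and not the authors' implementation. It assumes plain scaled dot-product mutual attention with no learned projections, and it leaves out the multi-head self-attention, the autoencoder, the mapping to multiple channels, and the densely connected CNN. All function names and the toy embeddings are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mutual_attention(queries, keys):
    """For each query vector, return the attention-weighted average of `keys`.

    Assumption: scaled dot-product attention without learned projections.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        attended.append(
            [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]
        )
    return attended


def enhance(words, other):
    """Concatenate each original word vector with its cross-attended vector."""
    return [w + a for w, a in zip(words, mutual_attention(words, other))]


def matching_matrix(text_a, text_b):
    """Single-channel matching map: entry (i, j) is the dot product of
    word i of text A and word j of text B."""
    return [[dot(a, b) for b in text_b] for a in text_a]


# Toy 2-dimensional embeddings (made-up values, for illustration only).
text_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 words
text_b = [[1.0, 0.0], [0.5, 0.5]]              # 2 words

enhanced_a = enhance(text_a, text_b)  # 3 vectors, each of length 4
enhanced_b = enhance(text_b, text_a)  # 2 vectors, each of length 4
match_map = matching_matrix(enhanced_a, enhanced_b)  # 3 x 2 map
```

In a MatchPyramid-style model, a map like `match_map` is treated as an image and passed to convolutional layers. According to the abstract, the paper first maps it into multiple feature subspaces to obtain a multi-channel input.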

Keywords: text similarity; attention mechanism; MatchPyramid; convolutional neural network

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
