Study on Machine Reading Comprehension Fusing Dynamic Convolution Attention


Authors: WU Chun-yan, LI Li [1], HUANG Peng-cheng, LIU Zhi-gui [1,2], ZHANG Xiao-qian [2]

Affiliations: [1] School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621000, China; [2] School of Information Engineering, Southwest University of Science and Technology, Mianyang 621000, China

Source: Computer Technology and Development, 2023, No. 7, pp. 160-166 (7 pages)

Funding: National Natural Science Foundation of China (62102331, 62176125, 61772272).

Abstract: To address insufficient feature extraction and low answer-prediction accuracy when long short-term memory (LSTM) networks and attention mechanisms process text sequences in machine reading comprehension, a span-extraction machine reading comprehension model fusing dynamic convolution attention is proposed. Because the current input and the previous state of an LSTM are independent of each other, which may cause loss of contextual information, the model adopts the Mogrifier as its encoder: the current input and the previous state interact fully over several rounds, strengthening the salient structural features of the context and the question while weakening secondary ones. Furthermore, because static convolution uses a single fixed kernel, it can only extract features over a fixed text span, which may hinder the machine's understanding of the text. Dynamic convolution is therefore introduced, using one-dimensional convolutions with multiple kernel sizes to capture the local structure of the context and the question, compensating for the attention mechanism's purely global receptive field. Experimental results on the SQuAD dataset show that, compared with other models, the proposed method effectively improves the model's feature extraction and answer prediction.
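The abstract's description of the Mogrifier encoder (the current input and the previous hidden state gating each other over several rounds before the LSTM cell update) can be sketched as follows. This is a minimal numpy illustration of the published Mogrifier scheme, not the paper's implementation; the matrices `Q` and `R`, the dimension `d`, and the round count are placeholder assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mogrify(x, h, Q, R, rounds=5):
    """Alternately gate the LSTM input x and previous hidden state h
    before the cell update: odd rounds rescale x from h, even rounds
    rescale h from x (the Mogrifier interaction scheme)."""
    for i in range(1, rounds + 1):
        if i % 2 == 1:
            x = 2.0 * sigmoid(Q @ h) * x
        else:
            h = 2.0 * sigmoid(R @ x) * h
    return x, h

# Toy dimensions and random parameters, for illustration only.
rng = np.random.default_rng(0)
d = 4
x, h = rng.standard_normal(d), rng.standard_normal(d)
Q, R = rng.standard_normal((d, d)), rng.standard_normal((d, d))
x2, h2 = mogrify(x, h, Q, R, rounds=5)
print(x2.shape, h2.shape)  # (4,) (4,)
```

With `rounds=0` the function degenerates to an ordinary LSTM step's untouched inputs, which is the baseline the paper contrasts against.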
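Likewise, the idea of replacing a single static kernel with one-dimensional convolutions of several kernel sizes can be sketched in numpy. This is only an assumed illustration: the kernel widths, weights, and the softmax mixing are placeholders, and a fully dynamic convolution would predict the mixing logits from the input rather than fix them.

```python
import numpy as np

def conv1d_same(x, w):
    """'Same'-padded depthwise 1-D convolution of a (length, channels)
    sequence with a shared odd-width kernel w."""
    k = len(w)
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([sum(w[j] * xp[t + j] for j in range(k))
                     for t in range(x.shape[0])])

def multi_kernel_conv(x, kernels, logits):
    """Blend 1-D convolutions of several kernel widths with softmax
    weights, capturing local structure at multiple scales."""
    w = np.exp(logits - np.max(logits))
    w /= w.sum()
    return sum(wi * conv1d_same(x, ki) for wi, ki in zip(w, kernels))

# Toy sequence: 6 tokens, 4 channels; kernel widths chosen for illustration.
rng = np.random.default_rng(1)
x = rng.standard_normal((6, 4))
kernels = [[1.0],                          # width 1: pointwise
           [0.25, 0.5, 0.25],              # width 3: local smoothing
           [0.1, 0.2, 0.4, 0.2, 0.1]]      # width 5: wider local context
y = multi_kernel_conv(x, kernels, np.zeros(3))
print(y.shape)  # (6, 4)
```

The multi-width local features produced here are the kind of signal the model combines with attention, whose receptive field is global but has no built-in notion of locality.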

Keywords: machine reading comprehension; span extraction; answer prediction; long short-term memory network; dynamic convolution

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
