Authors: LUO Senlin[1], WANG Ruiyi, WU Qian, PAN Limin[1], WU Zhouting
Affiliations: [1] School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China; [2] National Computer Network Emergency Response Technical Team/Coordination Center, Beijing 100094, China
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2021, No. 1, pp. 93-101 (9 pages)
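Funding: National "Twelfth Five-Year" Science and Technology Support Program (2012BAI10B01); Beijing Institute of Technology Basic Research Fund (20160542013); National "242" Information Security Program (2017A149).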
Abstract: Abstractive summarization analyzes the core ideas of a document and rephrases them, possibly with new words, to produce a summary of the whole document. Encoder-decoder abstractive summarization models cannot fully extract syntactic knowledge, so the generated summaries can violate grammar rules. In addition, recurrent neural networks tend to forget historical information and cannot be computed in parallel during training, so summaries of long texts lack a prominent main idea and encoding is slow. To address these problems, an abstractive summarization method based on a convolution-self-attention model that fuses sequential syntactic knowledge is proposed. The method first builds a phrase structure tree for the text, serializes the syntactic knowledge, and embeds it into the encoder so that encoding can make full use of syntactic information. It then replaces the recurrent neural network with a convolution-self-attention model for encoding, which better learns both the global and the local information in the text. Experiments on the CNN/Daily Mail corpus show that the proposed method outperforms current state-of-the-art methods: the generated summaries are more grammatical, their main ideas are more prominent, and the model encodes faster.
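The abstract mentions serializing syntactic knowledge from a phrase structure tree but does not say how. As a rough illustration only, the sketch below shows one possible serialization: each word is paired with the chain of constituent labels from the root down to its part-of-speech tag. The bracketed input format, the function names `parse_tree` and `serialize_syntax`, and the choice of root-to-leaf paths are all assumptions made for this example, not the paper's actual procedure.

```python
def parse_tree(text):
    """Parse a bracketed phrase structure tree into nested (label, children) tuples.

    Hypothetical helper: assumes Penn-Treebank-style brackets, e.g. "(NP (DT the) (NN cat))".
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def serialize_syntax(tree, path=()):
    """Walk the tree depth-first and return (word, label path) pairs in sentence order."""
    label, children = tree
    path = path + (label,)
    pairs = []
    for child in children:
        if isinstance(child, str):
            pairs.append((child, path))
        else:
            pairs.extend(serialize_syntax(child, path))
    return pairs


tree = parse_tree("(S (NP (DT the) (NN cat)) (VP (VBD sat)))")
pairs = serialize_syntax(tree)
for word, labels in pairs:
    print(word, "/".join(labels))
```

In a setup like the one the abstract describes, a per-word label sequence of this kind could be mapped to embeddings and combined with the word embeddings before encoding; how the paper actually combines them is not stated in this record.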
Keywords: abstractive summarization; encoder-decoder model; syntactic analysis; convolution-self-attention model; attention mechanism
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]