A two-stage text summarization model combining topic and position information


Authors: REN Shuxia[1], ZHANG Jing, ZHAO Zongxian, RAO Dongzhang

Affiliations: [1] School of Software, Tiangong University (天津工业大学), Tianjin 300387, China; [2] School of Computer Science and Technology, Tiangong University, Tianjin 300387, China

Source: Intelligent Computer and Applications (《智能计算机与应用》), 2023, No. 9, pp. 158-163 (6 pages)

Funding: Natural Science Foundation of Tianjin (19JCYBJC18700).

Abstract: The pre-trained model BERT has significantly improved the performance of text summarization models, but it still falls short in exploring the global semantics of a document and in using sentence position information. To address these problems, this paper proposes a two-stage automatic summarization model that combines dual topic embedding with sentence absolute position embedding. First, topic embeddings are introduced in each of the two stages, integrating rich semantic features to capture more accurate global semantics. Second, sentence absolute position embedding is introduced in the extractive stage to fully integrate sentence position information, providing more comprehensive auxiliary information for summary extraction. On this basis, the model adopts an extractive-abstractive two-stage hybrid summarization framework: the extractive stage selects the important information in the text, which reduces redundancy in the generated summary and further improves model performance. Experimental results on the CNN/Daily Mail dataset show that the proposed model achieves good results.

Keywords: hybrid summarization; BERT; dual topic embedding; sentence absolute position embedding

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
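The abstract describes adding topic embeddings and sentence absolute position embeddings to the model input but gives no formulas. The following is only a minimal sketch of one way such embeddings might be combined, not the authors' implementation. It assumes element-wise summation, a fixed sinusoidal encoding of the sentence index, and a single document-level topic vector; the function names `sinusoidal`, `sentence_indices`, and `combine_embeddings` and all placeholder values are hypothetical.

```python
import math


def sinusoidal(position, dim):
    """Fixed sinusoidal vector for an integer position (an assumed encoding choice)."""
    return [
        math.sin(position / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def sentence_indices(tokens, sep="[SEP]"):
    """Map each token to the absolute index of the sentence it belongs to."""
    indices, current = [], 0
    for tok in tokens:
        indices.append(current)
        if tok == sep:  # a separator closes the current sentence
            current += 1
    return indices


def combine_embeddings(token_vecs, sent_idx, topic_vec):
    """Element-wise sum: token + sentence absolute position + topic embedding."""
    dim = len(topic_vec)
    combined = []
    for vec, s in zip(token_vecs, sent_idx):
        pos = sinusoidal(s, dim)
        combined.append([v + p + t for v, p, t in zip(vec, pos, topic_vec)])
    return combined


# Toy input: two sentences, each wrapped in [CLS] ... [SEP].
tokens = ["[CLS]", "a", "b", "[SEP]", "[CLS]", "c", "[SEP]"]
sent_idx = sentence_indices(tokens)

dim = 4
token_vecs = [[0.0] * dim for _ in tokens]  # placeholder token embeddings
topic_vec = [0.5] * dim                     # placeholder document topic vector
inputs = combine_embeddings(token_vecs, sent_idx, topic_vec)
```

In this sketch every token of a sentence shares the same position vector, so the encoder input carries the sentence's absolute position in the document, while the topic vector is added to all tokens as a global signal. A real system would obtain the token embeddings from BERT and the topic vector from a topic model, neither of which is shown here.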

 
