Generation Method of Extractive Text Summarization Based on Deep Q-Learning

Authors: WANG Canyu; SUN Xiaohai; WU Yehui; JI Rongbiao; LI Yadong; ZHANG Shaoru; YANG Shihao (College of Big Data, Yunnan Agricultural University, Kunming 650201, China; Jilin Haicheng Technology Company Limited, Changchun 130033, China; College of Information Science and Technology, Northeast Normal University, Changchun 130117, China)

Affiliations: [1] College of Big Data, Yunnan Agricultural University, Kunming 650201, China; [2] Jilin Haicheng Technology Company Limited, Changchun 130033, China; [3] College of Information Science and Technology, Northeast Normal University, Changchun 130117, China

Source: Journal of Jilin University (Information Science Edition), 2023, No. 2, pp. 306-314 (9 pages)

Funding: Jilin Provincial Department of Science and Technology (20220201140GX); Changchun Municipal Science and Technology Bureau (21ZY31).

Abstract: Extractive text summarization selects key text fragments from the input text to serve as the summary. To remove the need for sentence-level labels during training, a label-free extractive summarization method based on deep reinforcement learning is proposed: summarization is modeled as a Q-learning problem, and a DQN (Deep Q-Network) is used to learn the Q-value function. Because the document representation is crucial to the quality of the generated summary, a hierarchical representation is adopted: BERT (Bidirectional Encoder Representations from Transformers) produces sentence-level vector representations, and a Transformer encoder produces the document-level vector representation. The decoder scores each sentence by its information richness, salience, positional importance, and redundancy with the current summary. Since no sentence-level labels are required when extracting sentences, annotation workload is significantly reduced. Experiments on the CNN (Cable News Network)/DailyMail dataset show that, compared with other extractive models, this method achieves the best Rouge-L (38.35) and comparable Rouge-1 (42.07) and Rouge-2 (18.32) performance.
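The selection loop the abstract describes can be sketched in a greatly simplified tabular form. This is an illustrative assumption, not the paper's implementation: the function names and the toy reward (salience minus redundancy with the current summary) stand in for the paper's BERT/Transformer encoders, DQN function approximator, and ROUGE-based reward. The state is the set of already-selected sentence indices, and an action picks one more sentence:

```python
import random

def train_extractive_q(scores, redundancy, k=2, episodes=500,
                       alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning sketch of DQN-style extractive summarization.

    scores[i]        -- toy salience score of sentence i
    redundancy[i][j] -- toy overlap penalty between sentences i and j
    k                -- number of sentences to extract per summary
    Returns a Q-table mapping (state, action) -> value, where a state
    is a frozenset of already-selected sentence indices.
    """
    rng = random.Random(seed)
    n = len(scores)
    Q = {}

    def q(s, a):
        return Q.get((s, a), 0.0)

    for _ in range(episodes):
        state = frozenset()
        while len(state) < k:
            actions = [a for a in range(n) if a not in state]
            # Epsilon-greedy exploration over unselected sentences.
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: q(state, x))
            # Reward: salience minus redundancy with the current summary
            # (a stand-in for the paper's ROUGE-based reward).
            reward = scores[a] - sum(redundancy[a][j] for j in state)
            nxt = state | {a}
            future = 0.0
            if len(nxt) < k:  # bootstrap unless the summary is complete
                future = max(q(nxt, b) for b in range(n) if b not in nxt)
            Q[(state, a)] = q(state, a) + alpha * (
                reward + gamma * future - q(state, a))
            state = nxt
    return Q

def greedy_summary(Q, n, k=2):
    """Extract k sentence indices by acting greedily on the learned Q."""
    state, picked = frozenset(), []
    while len(state) < k:
        actions = [a for a in range(n) if a not in state]
        a = max(actions, key=lambda x: Q.get((state, x), 0.0))
        picked.append(a)
        state = state | {a}
    return picked
```

For example, with three sentences where the first two are salient but highly redundant with each other, the greedy policy learns to pair the top sentence with the non-redundant one rather than selecting both near-duplicates; the paper's DQN replaces the table with a network over BERT/Transformer document representations so that unseen states generalize.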

Keywords: extractive text summarization; BERT model; encoder; deep reinforcement learning

Classification: TP391 [Automation and Computer Technology / Computer Application Technology]
