Automatic text summarization scheme based on deep learning  Cited by: 11


Authors: ZHANG Kejun[1,2]; LI Weinan; QIAN Rong; SHI Taimeng[1]; JIAO Meng

Affiliations: [1] Department of Computer Science and Technology, Beijing Electronic Science and Technology Institute, Beijing 100070, China; [2] School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China

Source: Journal of Computer Applications, 2019, No. 2, pp. 311-315 (5 pages)

Funding: National Key Research and Development Program of China (2018YFB1004101)

Abstract: To address inadequate semantic understanding, disfluent summary sentences, and insufficient summary accuracy in abstractive automatic summarization for Natural Language Processing (NLP), a new abstractive summarization solution is proposed, comprising an improved word vector generation technique and an abstractive summarization model. The improved word vector generation technique starts from word vectors produced by the Skip-Gram method and, in view of the characteristics of summaries, introduces three word features: part of speech, term frequency, and inverse document frequency, which improves how well word meaning is represented. The proposed Bi-MulRnn+ abstractive summarization model is built on the sequence-to-sequence (seq2seq) framework and an autoencoder structure, and introduces an attention mechanism, Gated Recurrent Unit (GRU) cells, a Bi-directional Recurrent Neural Network (BiRnn), a Multi-layer Recurrent Neural Network (MultiRnn), and beam search to improve summary accuracy and sentence fluency. Experimental results on the Large-Scale Chinese Short Text Summarization (LCSTS) dataset show that the scheme effectively handles abstractive summarization of short texts and performs well under the ROUGE evaluation metrics.
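The abstract does not state how the three word features are combined with the Skip-Gram vectors. The following is a minimal sketch that assumes simple concatenation: a one-hot part-of-speech vector, the term frequency within the document, and a smoothed inverse document frequency are appended to a base embedding. The toy corpus, the 2-dimensional stand-in embeddings, the tag set, and the function names are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy corpus: each document is a list of (token, pos_tag) pairs.
DOCS = [
    [("model", "NOUN"), ("generates", "VERB"), ("summary", "NOUN")],
    [("summary", "NOUN"), ("is", "VERB"), ("short", "ADJ")],
]

# Stand-in 2-d embeddings; in practice these would come from a trained Skip-Gram model.
EMBEDDINGS = {
    "model": [0.1, 0.3],
    "generates": [0.4, -0.2],
    "summary": [0.7, 0.5],
    "is": [0.0, 0.1],
    "short": [-0.3, 0.6],
}

# Assumed tag set used for the one-hot part-of-speech feature.
POS_TAGS = ["NOUN", "VERB", "ADJ"]


def term_frequency(token, doc):
    """Relative frequency of `token` within one document."""
    count = sum(1 for t, _ in doc if t == token)
    return count / len(doc)


def inverse_document_frequency(token, docs):
    """Smoothed IDF: log((1 + N) / (1 + df)) + 1 (one common variant)."""
    df = sum(1 for doc in docs if any(t == token for t, _ in doc))
    return math.log((1 + len(docs)) / (1 + df)) + 1.0


def augment_vector(token, pos_tag, doc, docs):
    """Concatenate base embedding, one-hot POS, TF, and IDF (assumed scheme)."""
    one_hot = [1.0 if tag == pos_tag else 0.0 for tag in POS_TAGS]
    tf = term_frequency(token, doc)
    idf = inverse_document_frequency(token, docs)
    return EMBEDDINGS[token] + one_hot + [tf, idf]


if __name__ == "__main__":
    for token, pos in DOCS[0]:
        print(token, augment_vector(token, pos, DOCS[0], DOCS))
```

Concatenation keeps the original embedding intact and lets the downstream encoder learn how much weight to give each extra feature. Other combinations, such as scaling the embedding by a TF-IDF weight, are equally plausible readings of the abstract.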

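Keywords: natural language processing; abstractive automatic text summarization; sequence-to-sequence; autoencoder; word vector; recurrent neural network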

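Classification number: TP181 [Automation and Computer Technology - Control Theory and Control Engineering]; TP391.1 [Automation and Computer Technology - Control Science and Engineering]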

 
