Review of Research on Extractive-abstractive Automatic Text Summarization Technology  (Cited by: 2)

Authors: LIU Di [1,2,3]; XI Xue-feng [1,2,3]; CUI Zhi-ming [1,2,3]; SHENG Sheng-li [4]

Affiliations: [1] School of Electronic Information and Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215000, China; [2] Suzhou Key Laboratory of Virtual Reality Intelligent Interaction and Application Technology, Suzhou, Jiangsu 215000, China; [3] Suzhou Smart City Research Institute, Suzhou, Jiangsu 215000, China; [4] Texas Tech University, Lubbock, Texas 79401, USA

Source: Computer Technology and Development (《计算机技术与发展》), 2023, No. 5, pp. 1-8 (8 pages)

Funding: National Natural Science Foundation of China (61876217, 62176175); Jiangsu Province "Six Talent Peaks" High-Level Talent Project (XYDXX-086).

Abstract: Automatic text summarization is an information-compression technology that uses a computer to automatically convert a text or a collection of texts into a short summary for a given application. With the rapid development of the Internet, large volumes of complex information have emerged, and people can no longer accurately pick out the useful information by hand. To collect information more accurately, conveniently, and efficiently, automatic text summarization, a natural language processing technique well suited to handling complex texts, becomes especially valuable. As extractive and abstractive summarization have matured, extractive-abstractive summarization has gradually emerged. This paper reviews extractive-abstractive summarization with technical analysis as its main thread. First, it introduces the evaluation methods used for extractive-abstractive summarization and the commonly used Chinese and English datasets. Second, it analyzes six mainstream classes of methods through examples and compares their advantages and disadvantages: methods based on reinforcement learning, information theory, pointer networks, sequence labeling, pre-training, and joint attention. Finally, it summarizes the challenges facing extractive-abstractive summarization and discusses directions for its future development.

Keywords: natural language processing; automatic text summarization; extractive-abstractive; evaluation methods; datasets

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
