Query-oriented News Multi-document Extractive Summarization Method Based on Hierarchical BiGRU+Attention  Cited by: 6


Authors: ZENG Zhao-lin (曾昭霖); YAN Xin (严馨) [1,2]; XU Guang-yi (徐广义); CHEN Wei (陈玮) [1,2]; DENG Zhong-ying (邓忠莹)

Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China; [3] Yunnan Nantian Electronic Information Industry Co., Ltd., Kunming 650040, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2023, No. 1, pp. 185-192 (8 pages)

Funding: Supported by the Regional Science Fund of the National Natural Science Foundation of China (61562049, 61462055).

Abstract: Most existing query-oriented multi-document extractive summarization methods compute the content salience and the query relevance of sentences separately, and they model vector representations insufficiently. To address these problems, this paper proposes a query-oriented news multi-document extractive summarization method based on hierarchical BiGRU+Attention. First, a hierarchical BiGRU+Attention neural network model is trained to obtain sentence and document vector representations rich in contextual semantic information. During this process, a bilinear-transformation attention mechanism lets the document vector representation both reflect the deep thematic information of the document and incorporate the relevance between each sentence and the user query; the similarity between each sentence vector and this document vector then gives the sentence importance score. Second, the sentence importance score, the keyword features contained in the sentence, the sentence length feature, and the temporal weight coefficient of the sentence are combined by weighting into the final comprehensive feature weight score of each sentence. Finally, the MMR algorithm is used to select the summary sentences. Experimental results show that, compared with other methods, the proposed method improves the quality of query-oriented multi-document extractive summarization to a certain extent and is effective to a certain degree.
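The scoring and selection steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the softmax over bilinear scores, the linear weighted sum, the weight values, the MMR trade-off `lam`, and every function name below (`bilinear_attention`, `combined_score`, `mmr_select`) are assumptions made for illustration. The sentence vectors, which in the paper come from the trained hierarchical BiGRU, are replaced here by toy lists of floats.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def bilinear_attention(sent_vecs, query_vec, W):
    """Build a query-aware document vector (assumed form).

    Each sentence s_i gets a bilinear score s_i^T W q, the scores are
    normalised with a softmax, and the document vector is the
    attention-weighted sum of the sentence vectors.
    """
    Wq = [sum(w * q for w, q in zip(row, query_vec)) for row in W]
    scores = [sum(s * x for s, x in zip(sv, Wq)) for sv in sent_vecs]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    alphas = [e / z for e in exps]
    dim = len(sent_vecs[0])
    doc = [sum(a * sv[d] for a, sv in zip(alphas, sent_vecs)) for d in range(dim)]
    return doc, alphas


def combined_score(importance, keyword, length, time_weight,
                   weights=(0.5, 0.2, 0.1, 0.2)):
    """Linear weighted sum of the four features (weights are hypothetical)."""
    feats = (importance, keyword, length, time_weight)
    return sum(w * f for w, f in zip(weights, feats))


def mmr_select(scores, sent_vecs, k, lam=0.7):
    """Greedy MMR: trade off a sentence's score against its redundancy
    with the sentences already selected."""
    selected = []
    candidates = list(range(len(scores)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max(
                (cosine(sent_vecs[i], sent_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

In use, the importance score of sentence `i` would be `cosine(sent_vecs[i], doc)`, where `doc` is the vector returned by `bilinear_attention`; that value would be passed to `combined_score` with the other three features, and the resulting scores would be passed to `mmr_select`.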

Keywords: query-oriented extractive summarization; Chinese multi-document; hierarchical BiGRU; attention mechanism

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
