Khmer multi-document extractive summarization method based on hierarchical maximal marginal relevance  Cited by: 1


Authors: ZENG Zhaolin; YAN Xin [1,2]; YU Bingbing; ZHOU Feng [1,2]; XU Guangyi

Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; [3] Yunnan Nantian Electronic Information Industry Company Limited, Kunming, Yunnan 650040, China

Source: Journal of Hebei University of Science and Technology, 2020, No. 6, pp. 508-517 (10 pages)

Funding: National Natural Science Foundation of China (61562049, 61462055).

Abstract: Traditional multi-document extractive summarization methods cannot make effective use of the semantic information shared between documents, and their summaries contain too much redundant content. To address these problems, a Khmer multi-document extractive summarization method based on hierarchical maximal marginal relevance (MMR) is proposed. First, the Khmer multi-document text is fed into a trained deep learning model to extract a summary for every single document. Then, all single-document summaries are merged iteratively in a hierarchical, waterfall-like manner, and an improved MMR algorithm selects the summary sentences to produce the final multi-document summary. The experimental results show that, compared with other methods, the Khmer multi-document summaries obtained by combining the deep learning method with the hierarchical MMR algorithm improve the R1, R2, R3 and RL values by 4.31%, 5.33%, 6.45% and 4.26%, respectively. The method effectively improves the quality of Khmer multi-document summaries while preserving the diversity and distinctness of the summary sentences.
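
The merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the improved MMR scoring or the deep learning extractor, so the code below assumes the standard MMR criterion (relevance weighted by `lam`, minus redundancy weighted by `1 - lam`), bag-of-words cosine similarity, and the centroid of the candidate pool as a stand-in for the query. It also assumes that "waterfall-like" means folding one single-document summary at a time into a running merged summary. The names `cosine`, `mmr_select`, `hierarchical_mmr`, `k`, and `lam` are hypothetical, and sentences are assumed to be pre-segmented into space-separated tokens, since Khmer text is not whitespace-delimited.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, k, lam=0.7):
    """Standard MMR (assumed here, not the paper's improved variant).

    Greedily picks up to k sentences, trading relevance to the pool
    centroid against similarity to sentences already selected.
    """
    vecs = [Counter(s.split()) for s in sentences]
    centroid = sum(vecs, Counter())  # assumption: pool centroid acts as the query
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * cosine(vecs[i], centroid) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]


def hierarchical_mmr(single_doc_summaries, k, lam=0.7):
    """Fold single-document summaries into one summary, one level at a time."""
    merged = []
    for summary in single_doc_summaries:
        # Each level re-runs MMR over the running summary plus the next one.
        merged = mmr_select(merged + list(summary), k, lam)
    return merged


if __name__ == "__main__":
    # Toy, pre-tokenized placeholder sentences (not real Khmer data).
    summaries = [
        ["a b c", "d e f"],
        ["a b c", "g h i"],
    ]
    print(hierarchical_mmr(summaries, k=3, lam=0.5))
```

In this toy run the sentence `"a b c"` appears in both single-document summaries, and the redundancy term keeps the second copy out of the merged result. Re-running MMR at every level keeps the candidate pool small, which is one plausible reason for merging hierarchically rather than pooling all sentences at once.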

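Keywords: multi-document summarization; text input; semantic information; maximal marginal relevance; deep learning; redundancy; extractive; diversity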

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
