Authors: ZHU Siwen (朱思文); ZHANG Yangsen (张仰森)[1]; WANG Xuesong (王雪松); SUN Longyuan (孙龙渊); XU Ruiyi (徐锐懿); JIA Qilong (贾启龙)
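Affiliations: [1] Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing 100101, China; [2] College of Information Management, Beijing Information Science and Technology University, Beijing 100192, China; [3] Beijing Municipal Audit Bureau, Beijing 100054, China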
Source: Journal of Harbin University of Science and Technology (《哈尔滨理工大学学报》), 2024, No. 6, pp. 32-44 (13 pages)
Funding: Beijing Social Science Key Fund (21GLA007).
Abstract: To address the problem that existing summarization models understand audit news insufficiently and tend to lose key information, this paper proposes TRB-KE (text rank and bart with knowledge enhancement model), a summarization model that combines knowledge enhancement with a generative summarization model. First, the first K sentences of a news article are retained to capture key information. Second, an extractive summarization model ranks the remaining sentences by importance and selects the high-quality ones. Third, an audit-domain knowledge base is built, and the terms appearing in the news are extracted together with their definitions and fed into the generative summarization model as background knowledge. Finally, the generative summarization model condenses the high-quality news text, fused with the background knowledge, into the summary. In addition, an audit news dataset is constructed for targeted training to improve model performance. Experimental results show that, compared with the baseline model, TRB-KE raises the mean ROUGE score by 0.98% on the audit news dataset and by 1.02% on the NLPCC2018 dataset, showing that the proposed model can learn deeper information in the news and improve the quality of the generated summaries.
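The input-preparation steps described in the abstract (keep the first K sentences, rank the rest, attach term definitions) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the TextRank-style word-overlap similarity, the function names, the `lead_k` and `keep_m` parameters, the whitespace-based tokenization (which suits English sample text, not Chinese), and the choice to prepend definitions to the text are all assumptions. The final generative step (a BART-style model in the paper) is left out.

```python
import math
import re
from itertools import combinations


def tokens(sentence):
    # Assumption: simple lowercase word tokens; Chinese text would need a segmenter.
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    # TextRank-style overlap similarity, normalized by log sentence lengths.
    wa, wb = set(tokens(a)), set(tokens(b))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids a zero denominator (log 1 == 0)
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    # Weighted PageRank over the sentence-similarity graph.
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        weights[i][j] = weights[j][i] = similarity(sentences[i], sentences[j])
    out = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out[j] * scores[j]
                for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


def build_model_input(sentences, glossary, lead_k=2, keep_m=2):
    # Step 1: keep the first K sentences unconditionally.
    lead, rest = sentences[:lead_k], sentences[lead_k:]
    # Step 2: rank the remaining sentences and keep the top M, in original order.
    scores = textrank(rest)
    top = sorted(range(len(rest)), key=lambda i: scores[i], reverse=True)[:keep_m]
    selected = [rest[i] for i in sorted(top)]
    # Step 3: look up glossary terms that occur anywhere in the article.
    article = " ".join(sentences).lower()
    background = [f"{term}: {definition}"
                  for term, definition in glossary.items()
                  if term.lower() in article]
    # Assumption: background knowledge is simply prepended to the text.
    return " ".join(background + lead + selected)
```

The returned string would then be passed to a generative summarization model; how TRB-KE actually fuses the background knowledge into that model is not specified in the abstract.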
Keywords: knowledge enhancement; generative summarization model; audit domain knowledge base; audit news summarization; audit news dataset
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]