Authors: SUN Baoshan [1,2]; TAN Hao [1]
Affiliations: [1] School of Computer Science and Technology, Tiangong University, Tianjin 300387, China; [2] Tianjin Key Laboratory of Autonomous Intelligence Technology and Systems, Tiangong University, Tianjin 300387, China
Source: Computer Engineering and Applications, 2022, No. 15, pp. 184-190 (7 pages)
Funding: National Natural Science Foundation of China (61972456, 61173032); Natural Science Foundation of Tianjin (20JCYBJC00140); Open Project of the Key Laboratory of Universal Wireless Communications (BUPT), Ministry of Education (KFKT-2020101)
Abstract: To address the problems that abstractive summarization models understand the source text insufficiently and tend to generate repetitive text, an algorithm combining the dynamic word-vector model ALBERT with the unified pre-training model UniLM is proposed, yielding an ALBERT-UniLM summarization model. The model first uses pre-trained dynamic ALBERT word vectors in place of the traditional BERT baseline for feature extraction. The UniLM language model, augmented with a pointer network, is then fine-tuned on the downstream generation task, and a coverage mechanism is incorporated to reduce the generation of repeated words when producing the summary. Experiments use ROUGE scores as the evaluation metric and are validated on the single-document Chinese news summarization dataset of the 2018 CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2018). Compared with the BERT baseline, the ALBERT-UniLM model improves ROUGE-1, ROUGE-2, and ROUGE-L by 1.57, 1.37, and 1.60 percentage points, respectively. The experimental results show that the proposed ALBERT-UniLM model clearly outperforms the baseline models on text summarization and effectively improves the quality of the generated summaries.
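The coverage mechanism mentioned in the abstract can be illustrated with a minimal sketch. The paper does not publish its implementation; the function below is an assumed, simplified version of the standard coverage penalty (the coverage vector accumulates past attention distributions, and the loss penalizes re-attending to already-covered source positions, which discourages repeated words):

```python
def coverage_loss(attention_steps):
    """Sum over decoder steps of sum_i min(attn_t[i], coverage_t[i]).

    attention_steps: list of attention distributions over source tokens,
    one list of floats per decoder step (hypothetical toy input).
    """
    n = len(attention_steps[0])
    coverage = [0.0] * n          # coverage starts at zero everywhere
    total = 0.0
    for attn in attention_steps:
        # penalize overlap between current attention and accumulated coverage
        total += sum(min(a, c) for a, c in zip(attn, coverage))
        # coverage accumulates the attention paid so far
        coverage = [a + c for a, c in zip(attn, coverage)]
    return total
```

A decoder that attends to the same source position twice (e.g. `[[1.0, 0.0], [1.0, 0.0]]`) incurs a positive loss, while spreading attention across positions (e.g. `[[1.0, 0.0], [0.0, 1.0]]`) incurs none.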
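The ROUGE metrics used for evaluation can likewise be sketched. This is not the official ROUGE toolkit, only a minimal illustration of ROUGE-1 (unigram overlap) and ROUGE-L (longest common subsequence) F1 over pre-tokenized word lists:

```python
from collections import Counter

def rouge_1_f1(candidate, reference):
    # Clipped unigram overlap between candidate and reference summaries.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

def _lcs_len(a, b):
    # Dynamic-programming longest common subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate, reference):
    # ROUGE-L: F1 over the longest common subsequence of the two summaries.
    lcs = _lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)
```

For example, comparing the token lists `["the", "cat", "sat"]` and `["the", "cat", "sat", "down"]` gives an F1 of 6/7 under both metrics.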
Keywords: natural language processing; pre-trained language model; ALBERT; UniLM; abstractive summarization
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]