Authors: CHENG Yue (成悦); ZHAO Kang (赵康); GOU Zhinan (勾智楠); GAO Kai (高凯) [1]
Affiliations: [1] School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China; [2] School of Information Technology, Hebei University of Economics and Business, Shijiazhuang, Hebei 050061, China
Source: Journal of Hebei University of Science and Technology (《河北科技大学学报》), 2022, No. 6, pp. 594-601 (8 pages)
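Funding: Natural Science Foundation of Hebei Province (F2022208006); Science and Technology Research Project of Higher Education Institutions of Hebei Province (QN2020198).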
Abstract: Most current extractive summarization methods focus on representing and extracting summary sentences and tend to neglect whether the text features themselves are adequately represented. To address this problem, an extractive summarization method based on metric learning and a hierarchical inference network is proposed. First, building on the extractive task, an extractive summarization model based on metric learning and hierarchical inference (MLHIN) is proposed. Second, the model is evaluated and tested on the English summarization dataset CNN/DailyMail. Finally, the test results are verified. The results show that the proposed model scores significantly higher than the other models on Rouge-1, Rouge-2 and Rouge-L, exceeding the Lead-3 model by 0.84%, 1.29% and 2.43%, respectively. When the proposed metric loss and the sentence encoder of the hierarchical inference model are replaced with other modules, model performance declines to varying degrees, which demonstrates the effectiveness of the proposed hierarchical inference network and metric loss. The new algorithm improves the model's ability to capture long-distance dependencies, strengthens its ability to distinguish summary sentences from non-summary sentences, and effectively improves the performance of extractive summarization.
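Keywords: natural language processing; sentence encoder; document encoder; metric learning; hierarchical inference; extractive text summarization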
Classification number: TN958.98 [Electronics and Telecommunications: Signal and Information Processing]
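The abstract compares MLHIN against the Lead-3 baseline using Rouge scores. As a rough illustration of what that comparison involves, the sketch below implements a Lead-k baseline (take the first k sentences as the summary) and a simplified Rouge-N score. It is not the paper's evaluation code: the function names `lead_k`, `ngrams` and `rouge_n` are hypothetical, tokenization is assumed to be plain lowercase whitespace splitting, and no stemming is applied, so the numbers it produces will not match published Rouge figures.

```python
from collections import Counter


def lead_k(sentences, k=3):
    """Lead-k baseline: use the first k sentences of the document as the summary."""
    return sentences[:k]


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified Rouge-N: (precision, recall, F1) over clipped n-gram overlap.

    Assumption: lowercase whitespace tokenization, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped overlap).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy document and reference summary, invented purely for illustration.
    document = ["the cat sat", "on the mat", "it was warm", "the end"]
    summary = " ".join(lead_k(document))
    print(rouge_n(summary, "the cat sat on a warm mat", n=1))
```

An extractive model such as the one described in the abstract would replace `lead_k` with a learned sentence selector, while the scoring step stays the same.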