Multi-hop Question Generation Based on Contrastive Learning Ideas


Authors: WANG Hongbin [1,2,3]; YANG Hezhenmin [1,2,3]; WANG Canyu

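Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China; [3] Yunnan Key Laboratory of Computer Technology Application, Kunming University of Science and Technology, Kunming 650500, China; [4] Faculty of Big Data, Yunnan Agricultural University, Kunming 650201, China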

Source: Journal of Jilin University: Science Edition, 2023, No. 5, pp. 1103-1111 (9 pages)

Funding: National Natural Science Foundation of China (Grant No. 61966020); Yunnan Province Basic Research Program, General Project (Grant No. CB22052C143A).

Abstract: Obtaining large-scale multi-hop question answering training datasets is time-consuming and labor-intensive. To address this, we propose a multi-hop question generation model based on the idea of contrastive learning. The model consists of a generation phase and a contrastive learning scoring phase. In the generation phase, candidate multi-hop questions are generated by executing an inference graph. In the scoring phase, a reference-free candidate question scoring model based on the idea of contrastive learning scores and ranks the candidates, and the best candidate question is selected. The model narrows, to some extent, the gap between unsupervised methods and manual annotation, and alleviates the shortage of multi-hop question answering datasets. Experimental results on the HotpotQA dataset show that the model can effectively expand the training data and greatly reduce the cost of manually labeling data.
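The two-phase design described in the abstract (generate candidates, then score them without a reference question and keep the best) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `rank_candidates`, `info_nce_loss`, and `toy_overlap_score` are hypothetical, the token-overlap scorer is only a stand-in for a learned scoring model, and the InfoNCE-style loss is one common contrastive objective, not necessarily the one used in the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rank_candidates(
    context: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score each candidate using the context only (no reference question)
    and return (candidate, score) pairs sorted best-first."""
    scored = [(q, score_fn(context, q)) for q in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def info_nce_loss(
    pos_score: float,
    neg_scores: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """InfoNCE-style contrastive loss over scalar scores (assumed form):
    -log( exp(s_pos / t) / sum_i exp(s_i / t) ), with the positive included in the sum."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


def toy_overlap_score(context: str, question: str) -> float:
    """Hypothetical stand-in scorer: fraction of question tokens that occur in the context."""
    ctx_tokens = set(context.lower().split())
    q_tokens = question.lower().split()
    if not q_tokens:
        return 0.0
    return sum(t in ctx_tokens for t in q_tokens) / len(q_tokens)


if __name__ == "__main__":
    context = "the film was directed by nolan who was born in london"
    candidates = ["where was the director born", "who directed the film"]
    ranked = rank_candidates(context, candidates, toy_overlap_score)
    best_question, best_score = ranked[0]
    print(best_question, best_score)
```

In a trained system the scorer would be a learned model, and a loss of this kind would push the score of a good question above the scores of corrupted or mismatched ones; the selection step itself stays a simple sort over candidate scores.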

Keywords: multi-hop question generation; machine reading comprehension; contrastive learning

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
