Authors: XU Liang, ZHANG Chun, ZHANG Ning, TIAN Xuetao (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
Affiliation: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Source: Journal of Computer Applications, 2023, No. 12, pp. 3668-3675 (8 pages)
Fund: National Key Research and Development Program of China (2019YFB1405202).
Abstract: The Prompt paradigm is widely used in zero-shot Natural Language Processing (NLP) tasks. However, existing zero-shot Relation Extraction (RE) models based on the Prompt paradigm suffer from the difficulty of constructing answer space mappings and from dependence on manual template selection, which leads to suboptimal performance. To address these issues, a zero-shot RE model that fuses multiple Prompt templates was proposed. Firstly, the zero-shot RE task was defined as a Masked Language Model (MLM) task and the construction of an answer space mapping was abandoned; instead, the words output by the templates were compared with the relation description texts in the word embedding space to determine the relation class. Secondly, the part of speech of the description text of the relation classes to be extracted was introduced as a feature, and the weight between this feature and each template was learned. Finally, this weight was used to fuse the outputs of multiple templates, thereby reducing the performance loss caused by manually selected Prompt templates. Experimental results on the FewRel (Few-shot Relation extraction dataset) and TACRED (Text Analysis Conference Relation Extraction Dataset) datasets show that, compared with the current state-of-the-art model RelationPrompt, the proposed model improves the F1 score by 1.48 to 19.84 percentage points and 15.27 to 15.75 percentage points respectively under different data resource settings. These results demonstrate that the proposed model achieves a significant improvement on zero-shot RE tasks.
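To make the described pipeline concrete, the following is a minimal sketch of MLM-based zero-shot relation classification with multi-template fusion, assuming a BERT-style masked language model from Hugging Face Transformers. The template texts, relation description texts, and per-template weights are illustrative placeholders, not the paper's; in particular, the fixed weight vector stands in for the learned mapping from part-of-speech features to template weights.

```python
# Sketch: compare the expected [MASK] word embedding of each prompt template
# with relation-description embeddings, then fuse template scores by weight.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "bert-base-uncased"                      # assumption: any BERT-like MLM
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL).eval()
emb = model.get_input_embeddings().weight        # static word-embedding table

# Hypothetical prompt templates; {h}/{t} mark head and tail entities.
TEMPLATES = [
    "{sent} The relation between {h} and {t} is [MASK].",
    "{sent} {h} is the [MASK] of {t}.",
]
# Hypothetical relation-description texts for the candidate classes.
RELATIONS = {
    "founder": "the person who founded the organization",
    "birthplace": "the place where the person was born",
}
# Stand-in for the learned POS-feature-based template weights.
TEMPLATE_WEIGHTS = torch.tensor([0.6, 0.4])

def text_embedding(text: str) -> torch.Tensor:
    """Average the static word embeddings of a description text."""
    ids = tokenizer(text, add_special_tokens=False, return_tensors="pt").input_ids[0]
    return emb[ids].mean(dim=0)

@torch.no_grad()
def classify(sent: str, head: str, tail: str) -> str:
    rel_vecs = {r: text_embedding(d) for r, d in RELATIONS.items()}
    scores = {r: 0.0 for r in RELATIONS}
    for w, tpl in zip(TEMPLATE_WEIGHTS, TEMPLATES):
        prompt = tpl.format(sent=sent, h=head, t=tail)
        inputs = tokenizer(prompt, return_tensors="pt")
        logits = model(**inputs).logits[0]
        mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()[0, 0]
        # Expected embedding of the word predicted at the [MASK] position.
        probs = logits[mask_pos].softmax(dim=-1)
        pred_vec = probs @ emb
        # Compare with each relation description in the embedding space,
        # then accumulate the weighted score for template fusion.
        for r, v in rel_vecs.items():
            scores[r] += w.item() * F.cosine_similarity(pred_vec, v, dim=0).item()
    return max(scores, key=scores.get)

print(classify("Steve Jobs started Apple in 1976.", "Steve Jobs", "Apple"))
```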
Keywords: relation extraction; information extraction; zero-shot learning; Prompt paradigm; pre-trained language model
Classification: TP391.1 (Automation and Computer Technology - Computer Application Technology)