Authors: LIU Han (刘涵); ALTENBEK Culi (古丽拉·阿东别克) [1,2,3]; YU Yingxia (于迎霞); MA Yajing (马雅静) [1,2,3]
Affiliations: [1] School of Computer Science and Technology, Xinjiang University, Urumqi 830046, China [2] The Base of Kazakh and Kirghiz Language of National Language Resources Monitoring and Research Center for Minority Languages, Urumqi 830046, China [3] Xinjiang Laboratory of Multi-language Information Technology, Urumqi 830046, China
Source: Journal of Minzu University of China (Natural Sciences Edition) (《中央民族大学学报(自然科学版)》), 2024, No. 4, pp. 20-29, 56 (11 pages in total)
Abstract: The goal of question understanding is to identify the underlying intent of a given utterance and to extract all relevant slot labels in a question-answering system. Most traditional methods build joint task models from a single-language corpus, disregarding the fact that user queries in real-world scenarios are often multilingual and diverse; as a result, effective approaches to multilingual joint intent detection and slot filling are lacking. In this paper, we propose a joint question understanding model, the Cross-lingual Bi-directional Propagation Model (XBPM), which handles the joint task of cross-lingual intent detection and slot filling and focuses on improving recognition performance in multilingual scenarios, particularly for Chinese ethnic minority languages. The model builds bi-directional connections between the intent detection and slot filling tasks on top of a cross-lingual pre-trained model, which gives it strong cross-lingual transferability. To address the scarcity of corpora for ethnic minority languages, we construct the Cross-lingual Tourism Field Question Dataset (XTFQD), which contains 16,548 Chinese and 1,399 Kazakh utterances in the tourism domain and provides new training and evaluation data for the joint task. Comparative and ablation experiments on the public cross-lingual question understanding dataset MTOD (Multilingual Task Oriented Dialog) and on XTFQD show that XBPM significantly outperforms the baseline models in both monolingual and cross-lingual settings, confirming the effectiveness of the model.
Classification number: TP3-05 [Automation and Computer Technology: Computer Science and Technology]