Authors: WANG Jia-wen (王嘉文), GAO Ding-guo (高定国), SUOLANG Qu-zhen (索朗曲珍) [1,2], NI Qiong (尼琼)
Affiliations: [1] School of Information Science and Technology, Tibet University, Lhasa 850000, China; [2] Tibetan Information Technology Innovative Talent Cultivation Demonstration Base, Tibet University, Lhasa 850000, China
Source: Science Technology and Engineering (《科学技术与工程》), 2024, No. 24, pp. 10348-10355 (8 pages)
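Funding: National Natural Science Foundation of China (62166038); Sichuan Science and Technology Program (2023YFQ0044); Tibet University High-Level Talent Training Program (2021-GSP-S126).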
Abstract: Cross-lingual speech recognition uses data from multiple source languages to train a system that can recognize speech in a target language, which promotes communication and understanding across languages and cultures. It faces several open problems: how to use multilingual data to improve recognition of low-resource languages, domain shift or interference between source and target languages, and task weighting and data distribution across languages. To address them, a cross-lingual speech recognition model based on feature prompts was studied. To simplify the traditional process, which requires specialists to annotate phonemes uniformly, a cross-lingual data annotation method that tags the original data with its language was studied. Experiments were conducted on two public datasets. The results show that the proposed model's average error rate is 46.44% lower than that of Conformer, a current mainstream speech recognition model, and 2.1% lower than that of the baseline model, achieving high recognition accuracy. The results offer new ideas and methods for the field of cross-lingual speech recognition.
Keywords: feature prompt; cross-lingual; speech recognition; Conformer; ContextNet
Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]