Authors: WANG Jiawen (王嘉文), GAO Dingguo (高定国), SUOLANG Quzhen (索朗曲珍) [1,2]
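Affiliations: [1] School of Information Science and Technology, Tibet University, Lhasa 850000, China; [2] Tibetan Information Technology Innovative Talent Cultivation Demonstration Base, Tibet University, Lhasa 850000, China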
Source: Journal of Applied Acoustics (《应用声学》), 2025, No. 2, pp. 405-412 (8 pages)
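Funding: National Natural Science Foundation of China (62166038); Sichuan Science and Technology Program (2023YFQ0044); Tibet University Graduate "High-Level Talent Cultivation Program" project (2021-GSP-S126).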
Abstract: The choice of modeling unit is a key issue in Tibetan speech recognition, because it determines the training quality and recognition accuracy of the acoustic model. Earlier Tibetan speech recognition studies tested a variety of modeling units on different datasets, which makes it hard to identify a suitable unit and leaves related results difficult to compare or build on. To address this, the paper proposes acoustic-model modeling units for Tibetan speech recognition that are more widely applicable and give better recognition results. It summarizes and improves four modeling units, runs ablation experiments on data from three dialects, and trains five acoustic models. The results show that modeling units based on Latin phonemes suit the Ü-Tsang and Kham dialects, that modeling units based on Latin syllables suit the Amdo dialect, and that the improved attention-based deep convolutional acoustic model achieves the best recognition result on the Amdo dialect, with a character error rate of 14.67% on the test set.
Classification No.: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]