Dialect Language Recognition Based on CNN-BiGRU

Cited by: 3


Authors: FU Ying (付英); LIU Zengli (刘增力) [1]; TANG Hui (汤辉) (Kunming University of Science and Technology, Kunming, Yunnan 650504, China; Jiangxi Computing Center, Nanchang, Jiangxi 330003, China)

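Affiliations: [1] Kunming University of Science and Technology, Kunming, Yunnan 650504, China; [2] Jiangxi Provincial Science and Technology Infrastructure Platform Center (given in English in the author line as Jiangxi Computing Center), Nanchang, Jiangxi 330003, China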

Source: Communications Technology (《通信技术》), 2022, No. 6, pp. 712-719 (8 pages)

Funding: National Natural Science Foundation of China (Grant No. 61271007).

Abstract: To address the weak representational power of dialect features and the low recognition rate of dialect identification, this paper considers both feature extraction and model improvement, and runs simulation experiments on dialect language data of different durations. First, different feature extraction algorithms are compared to determine the best input features for the model. Second, focal loss replaces the cross-entropy loss function so that imbalanced and highly similar dialect classes receive different weights; the optimal parameters are determined experimentally to bring the model to its best performance. Third, the recognition performance of different models is compared on dialect utterances of different durations; the results show that the proposed improved model raises the average recognition rate by 4.09% over the baseline system. Finally, speech enhancement is applied to improve the generalization ability and robustness of the model.

Keywords: dialect language recognition; focal loss; model improvement; speech enhancement

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
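The abstract's second step, replacing cross-entropy with focal loss, can be illustrated with a minimal sketch. It uses the standard multi-class focal loss form, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), which may differ in detail from the variant used in the paper. The function names, the `gamma` value of 2.0, and the `alpha` weights are assumptions chosen for illustration; the abstract does not report the paper's tuned parameters.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample (illustrative, not the paper's code).

    logits: raw class scores, one per dialect class
    target: index of the true class
    gamma:  focusing parameter (assumed value; gamma=0 gives cross-entropy)
    alpha:  optional per-class weights (assumed values, e.g. larger
            for under-represented dialects)
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)

def cross_entropy(logits, target):
    """Plain cross-entropy, i.e. focal loss with gamma = 0 and no alpha."""
    return focal_loss(logits, target, gamma=0.0)

# An easy sample (confident, correct) and a hard sample (true class
# barely ahead of a similar class), both with true class 0.
easy = [4.0, 0.5, 0.1]
hard = [0.6, 0.5, 0.1]

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0, gamma=2.0)
    print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.4f}")
```

The FL/CE ratio is much smaller for the easy sample than for the hard one, which is the down-weighting effect that lets training concentrate on confusable or under-represented classes.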

 
