A Study of TDNN Models for Speech Recognition in Telephone-Recording Scenarios  Cited by: 1

Building TDNN Acoustic Model for Scene of Telephone Recording Speech Recognition


Author: KONG Lingjun (孔玲军), Binhai College, Nankai University, Tianjin 300270, China

Affiliation: [1] Binhai College, Nankai University, Tianjin 300270

Source: Journal of Fujian Computer (《福建电脑》), 2022, No. 4, pp. 50-52 (3 pages)

Abstract: In recent years, the time-delay neural network (TDNN) has achieved very good results in speech recognition. Because it uses weight sharing and subsampling, it has fewer parameters to train. This paper trains a TDNN acoustic model on a 3000-hour corpus of Chinese telephone recordings. On random test sets drawn from within the 3000 hours, the error rate of the TDNN is 0.62% to 1.18% lower than that of a DNN. On test sets drawn from outside the 3000 hours, neither model is consistently better, but both perform relatively stably.
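The weight sharing and subsampling mentioned in the abstract can be illustrated with a minimal pure-Python sketch of a single TDNN layer. It is not the paper's Kaldi configuration: the layer sizes, the context offsets `(-2, 0, 2)`, the ReLU activation, and the frame `stride` are all assumptions chosen for illustration, and the abstract does not say which form of subsampling the author used. The sketch shows two things: one weight matrix is reused at every time step, so the parameter count does not depend on the utterance length, and evaluating only some time steps reduces the amount of computation without changing the outputs that are kept.

```python
import random


def tdnn_layer(frames, weights, bias, offsets, stride=1):
    """Apply one TDNN layer to a sequence of feature frames.

    frames  : list of T input vectors, each of length d_in
    weights : d_out x (len(offsets) * d_in) matrix, shared by all time steps
    bias    : vector of length d_out
    offsets : temporal context, e.g. (-2, 0, 2) splices frames t-2, t, t+2
    stride  : hypothetical subsampling factor; only every stride-th valid
              time step is evaluated
    """
    start = max(0, -min(offsets))
    stop = len(frames) - max(0, max(offsets))
    outputs = []
    # Only time steps whose whole context lies inside the sequence are used.
    for t in range(start, stop, stride):
        spliced = [x for k in offsets for x in frames[t + k]]
        # The same weights and bias are applied at every t (weight sharing).
        out = [max(0.0, sum(w * x for w, x in zip(row, spliced)) + b)
               for row, b in zip(weights, bias)]
        outputs.append(out)
    return outputs


def parameter_count(d_in, d_out, offsets):
    """Number of trainable parameters; independent of the sequence length."""
    return d_out * len(offsets) * d_in + d_out


# Assumed toy dimensions, not taken from the paper.
random.seed(0)
d_in, d_out, offsets = 4, 3, (-2, 0, 2)
weights = [[random.uniform(-0.1, 0.1) for _ in range(len(offsets) * d_in)]
           for _ in range(d_out)]
bias = [0.0] * d_out
frames = [[random.random() for _ in range(d_in)] for _ in range(20)]

dense = tdnn_layer(frames, weights, bias, offsets, stride=1)
sparse = tdnn_layer(frames, weights, bias, offsets, stride=3)

print(len(dense), len(sparse))                 # 16 6
print(parameter_count(d_in, d_out, offsets))   # 39
```

With 20 input frames the dense pass evaluates 16 time steps and the strided pass only 6, while both use the same 39 parameters.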

Keywords: DNN; TDNN; Chinese telephone recordings; Kaldi; subsampling

Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
