Phoneme Recognition Based on B-Wave-U-Net Feature Enhancement at Low Signal-to-Noise Ratio

Authors: HUANG Huibo, SHAO Yubin[1], LONG Hua[1], DU Qingzhi[1] (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

Affiliation: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500

Source: Journal of Beijing University of Posts and Telecommunications, 2025, Issue 1, pp. 100-106 (7 pages)

Funding: Yunnan Provincial Key Laboratory of Media Convergence Project (220235205).

Abstract: To address the low accuracy of phoneme recognition at low signal-to-noise ratios (SNR), a phoneme recognition method based on B-Wave-U-Net feature enhancement is proposed. First, a bidirectional long short-term memory (BLSTM) network is integrated at the start of the Wave-U-Net encoder; a branch information flow is drawn from it and skip-connected to the end of the decoder, where a fully connected layer is added, forming the B-Wave-U-Net. Next, the B-Wave-U-Net is used to enhance and denoise the speech spectrogram. Finally, Mel filtering is applied to obtain log-Mel scale filter bank energy features. Phoneme recognition tests are conducted at 0 dB SNR with white noise as the noise source, using the THCHS30 dataset and the ResNet-BLSTM-CTC model. The results show that the proposed B-Wave-U-Net outperforms the comparison networks, reducing the phoneme error rate by 0.9% to 2.5%. This demonstrates the advantage of B-Wave-U-Net for noise-robust feature extraction in phoneme recognition.
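The test condition and the final feature step described in the abstract (white noise at 0 dB SNR, then Mel filtering of a spectrogram into log-Mel filter bank energies) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The 16 kHz sample rate, 512-point FFT, 160-sample hop, Hann window, 40 filters, the common 2595·log10(1 + f/700) mel formula, and the synthetic sine input are all assumptions chosen for illustration. The B-Wave-U-Net enhancement stage is omitted, and every function name here is hypothetical.

```python
import numpy as np

def add_white_noise(signal, snr_db, rng):
    """Mix white Gaussian noise into `signal` at the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale: (n_mels, n_fft//2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def power_spectrogram(signal, n_fft, hop):
    """Hann-windowed short-time power spectrum: (n_frames, n_fft//2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def log_mel_fbank(spectrogram, bank, eps=1e-10):
    """Apply the mel filters, then take the log (eps avoids log(0))."""
    return np.log(spectrogram @ bank.T + eps)

# Assumed setup: one second of a 440 Hz tone stands in for clean speech.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2.0 * np.pi * 440.0 * t)

noisy = add_white_noise(clean, snr_db=0.0, rng=rng)
spec = power_spectrogram(noisy, n_fft=512, hop=160)
# In the described method, an enhancement network would denoise `spec` here.
features = log_mel_fbank(spec, mel_filterbank(40, 512, sample_rate))

measured_snr_db = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

With these assumed settings, `features` holds one 40-dimensional log-Mel vector per frame, and `measured_snr_db` should land close to the requested 0 dB, which means the noise carries roughly as much power as the signal.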

Keywords: phoneme recognition; log-Mel scale filter bank energy; Wave-U-Net; bidirectional long short-term memory

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
