Chinese Tone Recognition Based on SPWD Time-Frequency Ridge Feature Extraction  Cited by: 3


Authors: Xu Zhengdan (徐郑丹), Yu Fengqin (于凤芹)[1]

Affiliation: [1] School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China

Source: Computer Applications and Software (《计算机应用与软件》), 2014, No. 3, pp. 142-145 (4 pages)

Funding: National Natural Science Foundation of China (61075008)

Abstract: To handle the non-stationarity of speech signals, the smoothed pseudo Wigner-Ville distribution (SPWD) is used to represent the speech signal of the syllable final clearly on the time-frequency plane. The time-frequency ridges of different tones vary in different ways. Thresholding and thinning convert the SPWD time-frequency matrix into a binary image, from which ridge lines are extracted with the Hough transform. Because the time-frequency ridge of the third tone is a curve, the line segments obtained by the Hough transform are fitted with a least-squares polynomial. Several equally spaced points are then selected along the ridge, and the point set together with its first-order difference serves as the time-frequency ridge feature, which is classified with a Gaussian mixture model (GMM). Simulation results show that the method recognizes tones well, with an average recognition rate of 86.48%. The second tone shows the largest improvement in recognition rate, 5.18%, and across different signal-to-noise ratios (SNRs) the recognition rate improves by up to 5.62%.
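The feature-construction step described in the abstract (least-squares polynomial fit of the ridge, equally spaced sampling, first-order difference) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ridge_features`, the polynomial degree of 2, the choice of 8 sample points, and the synthetic contour are all hypothetical, and the SPWD, thinning, Hough transform, and GMM stages are omitted.

```python
import numpy as np

def ridge_features(times, freqs, n_points=8, degree=2):
    """Build a ridge feature vector from (time, frequency) ridge samples.

    Hypothetical helper: degree and n_points are illustrative choices,
    not values taken from the paper.
    """
    times = np.asarray(times, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Least-squares polynomial fit of the ridge (frequency as a function of time)
    coeffs = np.polyfit(times, freqs, degree)
    # Equally spaced sample times along the ridge
    t_eq = np.linspace(times.min(), times.max(), n_points)
    f_eq = np.polyval(coeffs, t_eq)
    # First-order difference of the sampled ridge
    delta = np.diff(f_eq)
    # Feature vector: sampled points followed by their first difference
    return np.concatenate([f_eq, delta])

# Synthetic falling-then-rising contour, loosely resembling a curved ridge
t = np.linspace(0.0, 1.0, 50)
f = 200.0 + 120.0 * (t - 0.4) ** 2
features = ridge_features(t, f)
```

With 8 sample points the vector has 15 entries: 8 sampled frequencies and 7 differences. Vectors of this kind, computed per utterance, would then be the input to a per-tone classifier such as a GMM.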

Keywords: tone recognition; smoothed pseudo Wigner-Ville distribution; time-frequency ridge; Hough transform; least-squares polynomial fitting

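Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Electronics and Telecommunications: Information and Communication Engineering]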

 
