Affiliation: [1] School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
Source: Computer Applications and Software (《计算机应用与软件》), 2014, No. 3, pp. 142-145 (4 pages)
Funding: National Natural Science Foundation of China (61075008)
Abstract: To handle the non-stationarity of speech signals, the smoothed pseudo Wigner-Ville distribution (SPWD) is used to represent the speech signal of syllable finals (vowels) clearly on the time-frequency plane. The time-frequency ridges of different tones vary in different ways. Thresholding and thinning convert the SPWD time-frequency matrix into a binary image, from which ridge lines are extracted with the Hough transform. Because the time-frequency ridge of the third tone is a curve, the line segments found by the Hough transform are fitted with a least-squares polynomial. Several equally spaced points are then selected along the ridge, and the point set together with its first-order difference serves as the time-frequency ridge feature, which a Gaussian mixture model (GMM) classifies. Simulation results show that the method recognizes tones well, with an average recognition rate of 86.48%. The second tone shows the largest gain in recognition rate, 5.18%, and across different signal-to-noise ratios (SNRs) the recognition rate improves by up to 5.62%.
Keywords: tone recognition; smoothed pseudo Wigner-Ville distribution; time-frequency ridge; Hough transform; least-squares polynomial fitting
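Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Electronics and Telecommunications: Information and Communication Engineering]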
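
The feature-construction step summarized in the abstract (least-squares polynomial fit of the ridge, equally spaced sampling, first-order difference) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the ridge is already available as (time, frequency) pairs, for example pixel coordinates taken from the Hough line segments, and it spaces the samples equally along the time axis. The function names `polyfit_least_squares` and `ridge_features`, the polynomial degree, and the number of sample points are all hypothetical; the record does not state them. SPWD computation, thresholding, thinning, the Hough transform, and GMM classification are left out.

```python
def polyfit_least_squares(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients ordered from the constant term upward.
    """
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def ridge_features(times, freqs, degree=2, n_points=5):
    """Build a ridge feature vector: sampled ridge values plus their first difference.

    degree and n_points are illustrative defaults, not values from the paper.
    """
    coeffs = polyfit_least_squares(times, freqs, degree)
    t_start, t_end = min(times), max(times)
    step = (t_end - t_start) / (n_points - 1)
    samples = []
    for k in range(n_points):
        t = t_start + k * step  # equally spaced along the time axis (assumption)
        samples.append(sum(c * t ** p for p, c in enumerate(coeffs)))
    deltas = [samples[k + 1] - samples[k] for k in range(n_points - 1)]
    return samples + deltas
```

With `n_points` samples the feature vector has `2 * n_points - 1` entries: the fitted ridge values followed by their first differences. A vector like this, one per utterance, is the kind of input a per-tone Gaussian mixture model could be trained on.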