Emotional speaker recognition based on prosody transformation (Cited by: 1)

Chinese title: 基于韵律变换的情感说话人识别 (article published in English)

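Affiliations: [1] Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China; [2] Foshan University, Foshan 528000, China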

Source: Journal of Southeast University (English Edition), 2011, No. 4, pp. 357-360 (4 pages)

Funding: The National Natural Science Foundation of China (No. 60872073, 60975017, 51075068); the Natural Science Foundation of Guangdong Province (No. 10252800001000001); the Natural Science Foundation of Jiangsu Province (No. BK2010546)

Abstract: A novel emotional speaker recognition system (ESRS) is proposed to compensate for emotion variability. First, emotion recognition is adopted as a pre-processing stage to separate neutral speech from emotional speech. Then, the speech recognized as emotional is adjusted by prosody modification. Several methods, including Gaussian normalization, the Gaussian mixture model (GMM), and support vector regression (SVR), are used to define the F0 mapping rules between emotional and neutral speech, and the average linear ratio is used for duration modification. Finally, the modified emotional speech is used for speaker recognition. The experimental results show that the proposed ESRS significantly improves the performance of emotional speaker recognition, with a higher identification rate (IR) than the traditional recognition system. Emotional speech with F0 and duration modifications is also closer to neutral speech.
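To make the prosody-modification step more concrete, here is a minimal Python sketch of two of the operations the abstract names: Gaussian normalization of F0 and an average linear ratio for duration. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names, the use of the log-F0 domain, the convention that unvoiced frames are marked with 0, and the definition of the ratio as mean neutral duration divided by mean emotional duration. It should not be read as the authors' implementation.

```python
import math
from statistics import mean, pstdev


def gaussian_normalize_f0(f0_emotional, neutral_mean, neutral_std):
    """Shift and scale log-F0 so its mean/std match the neutral statistics.

    Assumptions (not from the paper): statistics are taken over log-F0 of
    voiced frames only, and unvoiced frames are encoded as 0.
    """
    log_f0 = [math.log(f) for f in f0_emotional if f > 0]
    src_mean, src_std = mean(log_f0), pstdev(log_f0)
    converted = []
    for f in f0_emotional:
        if f <= 0:
            converted.append(0.0)  # leave unvoiced frames untouched
            continue
        # z-score under the emotional distribution, re-expressed under neutral
        z = (math.log(f) - src_mean) / src_std if src_std > 0 else 0.0
        converted.append(math.exp(z * neutral_std + neutral_mean))
    return converted


def average_duration_ratio(neutral_durations, emotional_durations):
    """Hypothetical 'average linear ratio': mean neutral / mean emotional."""
    return mean(neutral_durations) / mean(emotional_durations)


def modify_durations(emotional_durations, ratio):
    """Scale every emotional segment duration by the same linear ratio."""
    return [d * ratio for d in emotional_durations]
```

In use, the neutral log-F0 mean and standard deviation would be estimated from a speaker's neutral training speech, and the ratio from paired neutral and emotional segment durations. The GMM and SVR mappings mentioned in the abstract would replace `gaussian_normalize_f0` with a learned regression and are not sketched here.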

Keywords: emotion recognition; speaker recognition; F0 transformation; duration modification

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
