Authors: CHEN Yiling; CHENG Yanfen[1]; CHEN Xianqiao[1]; WANG Hongxia[1]; LI Chao (School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China)
Affiliation: [1] School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China
Source: Journal of Harbin Institute of Technology, 2018, No. 11, pp. 160-166 (7 pages)
Funding: National Natural Science Foundation of China (51179146)
Abstract: Discrete emotion description models label human emotions with discrete adjectives, so they can represent only a limited set of single, explicit emotion types. Dimensional emotion models instead quantify the implicit state of complex emotions along multiple dimensions. In addition, the conventional speech emotion feature, the Mel Frequency Cepstral Coefficient (MFCC), ignores the correlation between the spectral features of adjacent frames because of frame division, and therefore tends to lose much useful information. To address this, the paper proposes an improved method that extracts a time firing series feature and a firing position information feature from the spectrogram to supplement MFCC, and uses each of the three features separately for speech emotion recognition. Based on the recognition results, a correlation analysis is carried out along the three dimensions of the PAD dimensional emotion model, P (Pleasure-displeasure), A (Arousal-nonarousal), and D (Dominance-submissiveness), to obtain weight coefficients for the features. Weighted fusion then yields the final PAD values of the emotional speech, which are mapped into the PAD 3D emotion space. Experiments show that the two added features not only detect the speaker's emotional state but also take into account the cross-correlation between adjacent spectra, complementing the MFCC features. While improving discrete recognition of basic emotion types, the method represents recognition results as coordinate points in the PAD 3D emotion space, uses a quantitative approach to reveal the positions of and relationships among emotions in that space, and exposes the mixed emotional content of emotional speech. This lays a foundation for subsequent research on classification and recognition of complex speech emotions.
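The correlation-weighted fusion step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how correlation coefficients are turned into weights, so the Pearson correlation, the clipping of negative values to zero, the normalization to sum to one, the equal-weight fallback, and the function names `correlation_weights` and `fuse_pad` are all assumptions made for illustration. The data are synthetic.

```python
import numpy as np


def correlation_weights(preds, target):
    """Per-dimension fusion weights from Pearson correlation (assumed scheme).

    preds:  array of shape (n_features, n_samples, 3), PAD predictions
            from each feature type (e.g. MFCC, time firing series,
            firing position information).
    target: array of shape (n_samples, 3), reference PAD annotations.
    Returns an array of shape (n_features, 3); each column sums to 1.
    """
    n_feat, _, n_dim = preds.shape
    corr = np.zeros((n_feat, n_dim))
    for f in range(n_feat):
        for d in range(n_dim):
            corr[f, d] = np.corrcoef(preds[f, :, d], target[:, d])[0, 1]
    # Assumption: undefined or negative correlations contribute no weight.
    corr = np.clip(np.nan_to_num(corr), 0.0, None)
    total = corr.sum(axis=0, keepdims=True)
    safe_total = np.where(total > 0, total, 1.0)
    # Assumption: fall back to equal weights if nothing correlates positively.
    return np.where(total > 0, corr / safe_total, 1.0 / n_feat)


def fuse_pad(preds, weights):
    """Weighted sum over features, giving one (P, A, D) point per sample."""
    return np.einsum("fd,fnd->nd", weights, preds)


# Synthetic demonstration: three feature types with different noise levels.
rng = np.random.default_rng(0)
target = rng.uniform(-1.0, 1.0, size=(50, 3))
preds = np.stack(
    [target + rng.normal(0.0, s, size=target.shape) for s in (0.2, 0.5, 0.8)]
)
weights = correlation_weights(preds, target)
fused = fuse_pad(preds, weights)  # shape (50, 3): points in PAD space
```

In this sketch the weights are computed on the same samples that are fused, purely to keep the example short; a real evaluation would presumably estimate them on held-out data.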
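Keywords: PAD 3D emotion model; speech emotion recognition; Mel Frequency Cepstral Coefficient (MFCC); time firing series; firing position information; correlation analysis
Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]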