Source: Acta Electronica Sinica (《电子学报》), 2016, No. 9, pp. 2282-2288 (7 pages)
Funding: National Natural Science Foundation of China (No. 61271360); Natural Science Foundation of Jiangsu Province (No. BK20131196)
Abstract: This paper proposes a structured Gaussian mixture model with constraint conditions (C-SGMM) and a voice conversion method for non-parallel corpora. A small number of identical syllables are extracted from the original non-parallel corpora of the source and target speakers. During training of the structured Gaussian mixture model, the semantic information carried by these syllables and the correspondence between their acoustic features are used to constrain the K-means cluster centers. Within the Expectation-Maximization (EM) iterations, the same constraints are used to correct the posterior probability that a speech frame belongs to each model component. The Gaussian components of the source and target C-SGMMs are then matched and aligned using the Acoustic Universal Structure (AUS) principle, from which the short-time spectral conversion function is derived. Subjective and objective evaluations show that the converted speech outperforms that of the traditional structured-model method in spectral (cepstrum) distortion, target tendency, and speech quality: the average distortion is only 0.52, the speaker recognition rate reaches 95.25%, and the ABX target-tendency score averages 0.82, bringing performance closer to that of voice conversion based on parallel corpora.
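The abstract does not state the exact rule used to correct the posteriors, so the following is only a minimal sketch of what a constrained E-step could look like. It assumes diagonal covariances, a hypothetical `labels` array that ties each frame from a shared syllable to one component (-1 for unconstrained frames), and a hypothetical mixing factor `boost`. None of these names or choices come from the paper.

```python
import numpy as np


def gaussian_logpdf_diag(X, mean, var):
    """Log density of a diagonal-covariance Gaussian for each row of X."""
    return -0.5 * (np.sum(np.log(2 * np.pi * var))
                   + np.sum((X - mean) ** 2 / var, axis=1))


def constrained_posteriors(X, weights, means, variances, labels, boost=0.9):
    """E-step of a GMM with an assumed constraint correction.

    labels[i] is the component index that frame i is tied to, or -1 if the
    frame is unconstrained. `boost` is an illustrative mixing factor, not a
    value taken from the paper.
    """
    K = len(weights)
    # Standard E-step, computed in the log domain for numerical stability.
    log_r = np.stack(
        [np.log(weights[k]) + gaussian_logpdf_diag(X, means[k], variances[k])
         for k in range(K)],
        axis=1,
    )
    log_r -= log_r.max(axis=1, keepdims=True)
    post = np.exp(log_r)
    post /= post.sum(axis=1, keepdims=True)

    # Assumed correction: pull tied frames toward their tied component.
    idx = np.flatnonzero(labels >= 0)
    one_hot = np.zeros_like(post)
    one_hot[idx, labels[idx]] = 1.0
    post[idx] = (1.0 - boost) * post[idx] + boost * one_hot[idx]
    return post
```

Because the correction is a convex combination of two distributions, each row of the result still sums to one, so the usual M-step updates can consume it unchanged.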
Keywords: voice conversion; structured Gaussian mixture model; non-parallel corpus; constraint conditions
Classification number: TN912.33 [Electronics and Telecommunications: Communication and Information Systems]