Label generation for Chinese statistical parametric speech synthesis  (Cited by: 1)


Authors: HAO Dongliang[1]; YANG Hongwu[1]; ZHANG Ce[1]; ZHANG Shuai[1]; GUO Lizhao; YANG Jingbo[1]

Affiliation: [1] College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou 730070, China

Source: Computer Engineering and Applications, 2016, No. 19, pp. 146-153 (8 pages)

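Funding: National Natural Science Foundation of China (No.61263036; No.61262055); Gansu Province Youth Science and Technology Research Fund (No.1208RJYA078; No.1107RJZA112); Northwest Normal University Young Teachers' Research Capability Enhancement Program (No.NWNU-LKQN-12-27)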

Abstract: For context-dependent label generation in Chinese statistical parametric speech synthesis, this paper designs a six-level context-dependent label format comprising an initial/final level, a syllable level, a word level, a prosodic word level, a prosodic phrase level, and a sentence level. The input Chinese sentence is first normalized and then parsed to obtain its structure and word segmentation. Grapheme-to-phoneme conversion yields the initial, final, and tone of each Chinese character. The Transformation-Based error-driven Learning (TBL) algorithm is then used to predict the prosodic word boundaries and prosodic phrase boundaries of the input text. From these results, the initial/final information of each character and its structural context are obtained, producing the context-dependent labels required for statistical parametric speech synthesis. A hidden Markov model (HMM) based Mandarin statistical parametric speech synthesis system that uses initials and finals as synthesis units is built, and subjective and objective tests evaluate how different label information affects the quality of the synthesized speech. The results show that the richer the context-dependent label information, the better the quality of the synthesized speech.
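As a rough illustration of the pipeline summarized in the abstract, the following Python sketch splits tone-numbered pinyin into initial, final, and tone, and then attaches positional context from the word, prosodic word, prosodic phrase, and sentence levels to each initial/final unit. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `split_syllable` and `make_labels`, the field tags `/A:` to `/F:`, the `prev^cur+next` layout, the `sil` padding, and the treatment of `y` and `w` as initials. Word segmentation and prosodic boundaries are supplied by hand as nested lists, so the text normalization, parsing, and TBL prediction steps are not implemented.

```python
from typing import List, Tuple

# Mandarin initials; two-letter initials come first so "zh" wins over "z".
# Treating "y" and "w" as initials is an assumption of this sketch.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(pinyin: str) -> Tuple[str, str, int]:
    """Split tone-numbered pinyin such as 'zhong1' into (initial, final, tone)."""
    tone = int(pinyin[-1])
    body = pinyin[:-1]
    for ini in INITIALS:
        if body.startswith(ini) and len(body) > len(ini):
            return ini, body[len(ini):], tone
    return "", body, tone  # zero-initial syllable such as 'an4'


def make_labels(utterance: List[List[List[List[str]]]]) -> List[str]:
    """Build one label per initial/final unit.

    utterance nesting: prosodic phrases -> prosodic words -> words -> syllables.
    """
    n_pph = len(utterance)
    n_syl = sum(len(word) for pph in utterance for pw in pph for word in pw)
    rows = []  # (unit, context string) for every initial and final
    for i_pph, pph in enumerate(utterance, 1):
        for i_pw, pw in enumerate(pph, 1):
            for i_w, word in enumerate(pw, 1):
                for i_s, syl in enumerate(word, 1):
                    ini, fin, tone = split_syllable(syl)
                    ctx = (f"/A:{tone}"                 # tone of the syllable
                           f"/B:{i_s}_{len(word)}"      # syllable position in word
                           f"/C:{i_w}_{len(pw)}"        # word position in prosodic word
                           f"/D:{i_pw}_{len(pph)}"      # prosodic word position in phrase
                           f"/E:{i_pph}_{n_pph}"        # prosodic phrase position in sentence
                           f"/F:{n_syl}")               # syllables in the sentence
                    for unit in (ini, fin):
                        if unit:
                            rows.append((unit, ctx))
    # Pad with silence so every unit has a left and a right neighbour.
    units = ["sil"] + [u for u, _ in rows] + ["sil"]
    return [f"{units[k]}^{units[k + 1]}+{units[k + 2]}{ctx}"
            for k, (_, ctx) in enumerate(rows)]


if __name__ == "__main__":
    # One prosodic phrase with two prosodic words: "zhong1 guo2" and "ren2".
    for label in make_labels([[[["zhong1", "guo2"]], [["ren2"]]]]):
        print(label)
```

Each added field corresponds to one level of the hierarchy, which makes it easy to drop fields and compare label sets of different richness, in the spirit of the evaluation the abstract describes.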

Keywords: text analysis; speech synthesis; context-dependent label; prosody prediction; grapheme-to-phoneme conversion

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
