Target Cost Construction for a Hybrid Unit Selection Speech Synthesis System  Cited by: 1


Authors: CAI Wenbin; WEI Yunlong[1]; XU Haihua[2]; PAN Lin[1]

Affiliations: [1] College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; [2] Temasek Laboratory, Nanyang Technological University, Singapore 639798, Singapore

Source: Computer Engineering and Applications (计算机工程与应用), 2018, No. 24, pp. 20-25 (6 pages)

Funding: Fujian Province Major Science and Technology Project (No. 2017H6009)

Abstract: In unit selection speech synthesis, units are generally chosen by minimizing the sum of the target cost and the concatenation cost. Because candidate units involve complex linguistic and acoustic properties, the key to improving synthesized speech quality is choosing acoustic (or linguistic) features that accurately represent each unit's characteristics and constructing the corresponding target cost. This paper explores target cost construction from two aspects: acoustic features (Mel-generalized cepstrum, log fundamental frequency, bottleneck features) and acoustic models. Experimental results show that a deep neural network (DNN) acoustic model trained on a large, similar corpus and then fine-tuned on the current corpus predicts more robust bottleneck features. Using those features in the target cost calculation guides the selection of better candidate units and improves the quality of the synthesized speech.
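The cost minimization described in the abstract can be illustrated with a small dynamic-programming search. The sketch below is not the authors' implementation: the dictionary keys `feat`, `head`, and `tail`, the Euclidean distance, and the weights `w_target` and `w_concat` are all assumptions made for illustration. In the paper's setting, the `feat` vectors would presumably be model-predicted features such as bottleneck features, Mel-generalized cepstra, or log F0.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_units(targets, candidates, w_target=1.0, w_concat=1.0):
    """Pick one candidate per position, minimizing target + concatenation cost.

    targets:    list of T predicted feature vectors (one per target unit).
    candidates: list of T non-empty lists; each candidate is a dict with
                hypothetical keys 'feat' (compared with the target),
                'head' and 'tail' (boundary features used at the joins).
    Returns (path, total_cost), where path[t] indexes candidates[t].
    """
    # best[t][j]: lowest accumulated cost of any path ending at candidate j.
    best = [[w_target * euclidean(targets[0], c["feat"]) for c in candidates[0]]]
    back = [[None] * len(candidates[0])]

    for t in range(1, len(targets)):
        row, ptr = [], []
        for cand in candidates[t]:
            target_cost = w_target * euclidean(targets[t], cand["feat"])
            # Cost of joining each previous candidate to this one.
            joined = [
                best[t - 1][i] + w_concat * euclidean(prev["tail"], cand["head"])
                for i, prev in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(joined)), key=joined.__getitem__)
            row.append(joined[i_min] + target_cost)
            ptr.append(i_min)
        best.append(row)
        back.append(ptr)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for t in range(len(targets) - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path, total


if __name__ == "__main__":
    targets = [[0.0], [1.0]]
    candidates = [
        [
            {"feat": [0.0], "head": [0.0], "tail": [5.0]},  # perfect target match, poor join
            {"feat": [0.5], "head": [0.0], "tail": [1.0]},  # weaker match, smooth join
        ],
        [{"feat": [1.0], "head": [1.0], "tail": [1.0]}],
    ]
    print(select_units(targets, candidates))
```

In this toy input, the first candidate matches its target exactly but joins badly to the next unit, so the search prefers the second candidate. Setting `w_concat=0.0` reduces the search to picking the best target match at each position.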

Keywords: speech synthesis; target cost; acoustic features; acoustic model; concatenation units

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
