A Two-stage Prosodic Structure Generation Strategy for Mandarin Text-to-speech Systems (中文语音合成系统中的一种两层韵律结构生成体系; article in English) Cited by: 2

Affiliations: [1] Beijing University of Posts and Telecommunications [2] France Telecom R&D (Beijing)

Source: Acta Automatica Sinica (《自动化学报》), 2010, No. 11, pp. 1569-1574 (6 pages)

Funding: Supported by the National Natural Science Foundation of China (90920001); the Key Project of the Ministry of Education of China (108012); and a joint research project between France Telecom R&D Beijing and Beijing University of Posts and Telecommunications (SEV01100474)

Abstract: Prosodic structure generation is the key component in improving the intelligibility and naturalness of synthetic speech in a text-to-speech (TTS) system. This paper investigates the automatic segmentation of prosodic words and prosodic phrases, the two fundamental layers in the hierarchical prosodic structure of Mandarin, and presents a two-stage prosodic structure generation strategy. At the front end, conditional random field (CRF) models are built for both prosodic word and prosodic phrase prediction, each with its own feature selection. At the back end, a transformation-based error-driven learning (TBL) module amends the initial prediction. Experimental results show that the approach combining CRF and TBL achieves an F-score of 94.66%.
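The two-stage idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the CRF front end is replaced by hard-coded initial labels, the two transformation rules are hand-written rather than learned, and the names `apply_rules`, `f_score`, the `"B"`/`"N"` label scheme, and the toy sentence are all hypothetical. The abstract does not specify the paper's feature sets or rule templates.

```python
from typing import Callable, List, Tuple

# Label scheme (an assumption for this sketch):
#   "B" = a prosodic boundary follows this word, "N" = no boundary.
# A rule is (from_label, to_label, condition); the condition inspects context.
Condition = Callable[[List[str], List[str], int], bool]
Rule = Tuple[str, str, Condition]


def apply_rules(words: List[str], labels: List[str], rules: List[Rule]) -> List[str]:
    """Back end: apply transformation rules in order to amend initial labels."""
    labels = list(labels)  # do not mutate the caller's list
    for from_label, to_label, condition in rules:
        for i in range(len(labels)):
            if labels[i] == from_label and condition(words, labels, i):
                labels[i] = to_label
    return labels


def f_score(predicted: List[str], gold: List[str], positive: str = "B") -> float:
    """Harmonic mean of precision and recall on the boundary label."""
    tp = sum(p == positive and g == positive for p, g in zip(predicted, gold))
    if tp == 0:
        return 0.0
    precision = tp / sum(p == positive for p in predicted)
    recall = tp / sum(g == positive for g in gold)
    return 2 * precision * recall / (precision + recall)


# Toy data: stand-in for the front-end (CRF) prediction and the reference labels.
words = ["我们", "的", "系统", "很", "好"]
initial = ["B", "N", "B", "N", "B"]
gold = ["N", "B", "B", "N", "B"]

# Hand-written illustrative rules; a real TBL module would learn such rules
# from the errors of the initial prediction on training data.
rules: List[Rule] = [
    # Remove a boundary placed directly before the particle "的".
    ("B", "N", lambda w, l, i: i + 1 < len(w) and w[i + 1] == "的"),
    # Place a boundary after the particle "的".
    ("N", "B", lambda w, l, i: w[i] == "的"),
]

corrected = apply_rules(words, initial, rules)
score_before = f_score(initial, gold)
score_after = f_score(corrected, gold)
```

Comparing `score_before` with `score_after` shows how a back-end correction pass can raise the F-score of an initial prediction; the numbers from this toy sentence say nothing about the 94.66% reported in the abstract.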

Keywords: Mandarin text-to-speech system; two-layer prosodic structure generation; computer technology; automation systems

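Classification: TN912.33 [Electronics and Telecommunications: Communication and Information Systems] TP3 [Electronics and Telecommunications: Information and Communication Engineering]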
