Pitch Post-Processing Methods in Mandarin Tone Recognition

Approaches to pitch processing in tone recognition of Mandarin


Authors: ZHOU Wei (周韡)[1], LIANG Weiqian (梁维谦)[2], LIU Runsheng (刘润生)[3]

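Affiliations: [1] Institute of Microelectronics, Tsinghua University, Beijing 100084; [2] Department of Electronic Engineering, Tsinghua University, Beijing 100084; [3] Tsinghua National Laboratory for Information Science and Technology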

Source: Journal of Guilin University of Electronic Technology (《桂林电子科技大学学报》), 2008, No. 3, pp. 214-218 (5 pages)

Funding: Beijing Lingshengxin (凌声芯) Speech Technology project (2008)

Abstract: Mandarin is a tonal language, and its tone information is carried mainly by the pitch contour of the FINAL. Because the extracted pitch is not sufficiently robust, it must be post-processed. This paper groups pitch post-processing into two approaches: frame-based and FINAL-based. For the frame-based approach, a normalization algorithm based on the FINAL average (FINAL-average logarithm scaling) is proposed and compared with the baseline system. For the FINAL-based approach, FINAL four-equal-length algorithms with forward and backward frame overlap are designed and compared. Experiments on the standard HTK platform show that the latter approach is superior. To account for the influence of neighbouring tones (co-articulation), the algorithms were also tested with a tone tri-phone model, which improved the recognition rate by approximately 10% over the mono-phone model.
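The abstract names the algorithms without defining them, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that (a) FINAL-average logarithm scaling means taking the log of each voiced frame's F0 divided by the mean F0 of that FINAL, (b) the four-equal-length algorithm cuts the FINAL into four segments of length ceil(n/4) and averages each, with the unavoidable overlapping frames pushed toward the front ("forward") or the back ("backward") segments, and (c) tone tri-phone labels follow the HTK-style `left-centre+right` context notation. All function names are hypothetical.

```python
import math
from statistics import mean


def normalize_by_final_mean(f0_hz):
    """Assumed frame-based scheme: log of F0 relative to the FINAL's mean F0.

    Frames with F0 <= 0 are treated as unvoiced and dropped.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in this FINAL")
    ref = mean(voiced)
    return [math.log(f / ref) for f in voiced]


def four_segment_means(values, overlap="forward"):
    """Assumed FINAL-based scheme: four equal-length segment means.

    Each segment has ceil(n / 4) frames. When n is not a multiple of 4,
    neighbouring segments must share frames; `overlap` decides whether the
    shared frames sit at the earlier or the later segment boundaries.
    """
    if overlap not in ("forward", "backward"):
        raise ValueError("overlap must be 'forward' or 'backward'")
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 frames")
    seg_len = -(-n // 4)          # ceil(n / 4)
    extra = 4 * seg_len - n       # number of shared frames, 0..3
    overlaps = [1] * extra + [0] * (3 - extra)
    if overlap == "backward":
        overlaps.reverse()
    starts = [0]
    for shared in overlaps:
        starts.append(starts[-1] + seg_len - shared)
    return [mean(values[s:s + seg_len]) for s in starts]


def tri_tone_labels(tones):
    """Expand a tone sequence into HTK-style 'left-centre+right' labels."""
    labels = []
    for i, tone in enumerate(tones):
        left = f"{tones[i - 1]}-" if i > 0 else ""
        right = f"+{tones[i + 1]}" if i + 1 < len(tones) else ""
        labels.append(f"{left}{tone}{right}")
    return labels
```

Under these assumptions, a FINAL of any length (at least four frames) is reduced to a fixed four-dimensional pitch feature, which is what makes a FINAL-based unit convenient to model; the context labels let each tone be modelled separately for each pair of neighbouring tones.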

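Keywords: speech recognition; tone recognition; pitch post-processing; overlap-forward/overlap-backward four-equal-segment mean algorithm; tone tri-phone model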

Classification: TP391.42 [Automation and Computer Technology: Computer Application Technology]

 
