A MASM-Based Lip Contour Feature Extraction Method and Audio-Visual Speech Recognition (cited by: 1)

A Lip Contour Extraction Method Based on Multiple Active Shape Model (MASM) for Audio Visual Speech Recognition

Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2004, No. 5, pp. 674-678 (5 pages)

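Funding: Supported by the international science and technology cooperation project between the Ministry of Science and Technology of China and the Flemish Region of Belgium (国科外 19990209号)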

Abstract: This paper proposes an MASM-based lip contour extraction method for audio-visual speech recognition. The method needs only a small amount of training data to extract a large number of lip contours accurately. A smoothing correction method for lip contours is also introduced; it exploits the continuous nature of mouth-shape changes to correct erroneous contours. Experiments show that the contour extraction accuracy of this method is 20 percentage points higher than that of the conventional ASM model. The resulting lip contour feature is then introduced into audio-visual speech recognition.

In audio-visual speech recognition and lipreading, the widely used ASM (Active Shape Model) for lip contour extraction lacks robustness and cannot extract exact lip contours, because the mouth shape varies widely during speech. We present a more robust model, the Multiple Active Shape Model (MASM). The model classifies mouth shapes into a closed-mouth set, a half-opened-mouth set, and a round-mouth set. An independent ASM is built for each set from a very small amount of training data. The MASM contour extraction algorithm automatically selects the most accurate lip contour from the multiple shape-searching procedures. Because the mouth changes continuously over time, a method for smoothing lip contours is also presented to correct contour extraction errors. Experimental results on the AVCONDIG database show that the extraction accuracy achieved by the MASM is 13% higher than that of the conventional ASM. Combining the MASM with the contour-smoothing method yields a further 7% improvement in accuracy. When the exact lip contour feature is fused with the audio MFCC (Mel Frequency Cepstral Coefficients) feature, the average word recognition rates on the connected-digits speech recognition task considered increase considerably under noisy acoustic conditions.
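The two steps described in the abstract, choosing one contour from several per-class ASM searches and then correcting outlier frames using temporal continuity, can be sketched as follows. The abstract does not state the selection criterion or the smoothing rule, so this is only an illustration under assumptions: each search is assumed to return a fit cost (lower is better), and smoothing is assumed to be a thresholded temporal median over three frames. The names `select_best_contour`, `smooth_contours`, and `threshold` are hypothetical and not taken from the paper.

```python
import numpy as np


def select_best_contour(candidates):
    """Pick one contour from several per-class ASM search results.

    candidates: dict mapping a mouth-shape class name (e.g. "closed",
    "half_open", "round") to a (contour, fit_cost) pair.
    ASSUMPTION: a lower fit_cost means a better fit; the paper's actual
    selection criterion is not given in the abstract.
    """
    best_class = min(candidates, key=lambda name: candidates[name][1])
    return best_class, candidates[best_class][0]


def smooth_contours(contours, threshold):
    """Correct outlier frames in a contour sequence.

    contours: array of shape (T, N, 2), T frames of N contour points.
    ASSUMPTION: a frame is treated as an extraction error when its mean
    point distance from the 3-frame temporal median exceeds `threshold`;
    it is then replaced by that median. The first and last frames are
    left unchanged.
    """
    contours = np.asarray(contours, dtype=float)
    fixed = contours.copy()
    for t in range(1, len(contours) - 1):
        local_median = np.median(contours[t - 1:t + 2], axis=0)
        deviation = np.linalg.norm(contours[t] - local_median, axis=1).mean()
        if deviation > threshold:
            fixed[t] = local_median
    return fixed
```

For example, given a sequence in which a single frame jumps far from its neighbors, `smooth_contours` replaces only that frame and leaves the others untouched; the threshold would need tuning to the image scale of the data at hand.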

Keywords: speech recognition; audio-visual speech recognition; ASM; MASM; lip contour extraction

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
