Affiliation: [1] State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084
Source: Journal of Tsinghua University (Science and Technology), 2004, No. 1, pp. 61-64 (4 pages)
Abstract: Acoustic modeling is a key step in continuous Chinese speech recognition. Based on the characteristics of Chinese speech, the extended Initial/Final (XIF) set was chosen as the basic recognition unit set; it outperformed the standard IF set. A question set was designed for the XIF units using Chinese linguistic knowledge, and decision tree-based state tying was used to build the context-dependent Initial/Final acoustic model (Tri-XIF). The Tri-XIF model was compared with a context-dependent phoneme model (Tri-phone) and a context-independent syllable model. Several methods were developed to optimize Tri-XIF modeling, including transcription refinement, question set extension, and model size reduction. Tests show that the Tri-XIF model performs much better than both the Tri-phone model and the syllable model, with the syllable error rate reduced by 24.53% relative to the Tri-phone model and by 41.65% relative to the syllable model. With the proposed optimization methods, the model size was reduced by more than 20% with little loss in performance.
Keywords: Chinese; continuous speech recognition; context-dependent; Initial; Final; decision tree
Classification No.: TN912.34 [Electronics and Telecommunications: Communication and Information Systems] TP391.12 [Electronics and Telecommunications: Information and Communication Engineering]