Authors: Bai Fan (白帆) [1], Jont B. Allen [2]
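Affiliations: [1] School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731; [2] Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, 61801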
Source: Journal of Signal Processing (《信号处理》), 2015, No. 6, pp. 727-736 (10 pages)
Funding: U.S. National Institutes of Health (Grant No. R21-RDC009277A)
Abstract: The performance of a speech recognition system is affected by many factors, such as differences among speakers, speaking style, and ambient noise. To improve recognition accuracy and robustness, one important approach is to look for better, more robust perceptual cues based on the characteristics of human auditory perception. To this end, the 3-Dimensional Deep Search (3DDS) method was developed to investigate how the human ear internally represents the perceptual cues of speech, and it has been applied successfully to find the perceptual cues of fricative and plosive consonants in natural speech. This paper extends the method to the nasal consonants /m/ and /n/. Based on an analysis of the results of three perceptual experiments, the redundant cue and the secondary perceptual cue are defined. The perceptual cue of /m/ is the speech component lying at approximately 363 to 1250 Hz, and the perceptual cue of /n/ is the speech component lying at approximately 939 to 2826 Hz.
Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]