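Affiliation: [1] Speech Information Processing Laboratory, Department of Electronic Science and Technology, University of Science and Technology of China, Hefei 230001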
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2016, No. 5, pp. 1107-1111 (5 pages)
Abstract: Conventional speaker recognition systems perform poorly in noisy conditions. A binary mask derived from computational auditory scene analysis can be used to reconstruct the noise-dominated parts of the speech, thereby restoring speaker-related information that the noise has corrupted. However, the quality of the reconstruction depends on the proportion of reliable units within each frame. This paper therefore sets a threshold based on the extracted binary mask and uses it to select frames of the test features, dividing them into three classes: frames to reconstruct, frames to retain, and frames to discard. Only the reconstructed and retained frames are passed on to subsequent processing and used for recognition. Experimental results show that the algorithm improves the recognition rate over the original reconstruction system.
Keywords: computational auditory scene analysis; Gammatone frequency cepstral coefficients (GFCC); ideal binary mask (IBM); threshold
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
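The frame selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the record does not give the threshold values or say which range of reliability maps to which class, so the sketch assumes two hypothetical cut-offs (`low`, `high`) on the fraction of mask entries equal to 1 in each frame, and assumes that mostly reliable frames are retained, partially reliable frames are reconstructed, and mostly unreliable frames are discarded. The function names are illustrative as well.

```python
from typing import List, Sequence


def reliable_ratio(frame_mask: Sequence[int]) -> float:
    """Fraction of channels in one frame that the binary mask marks reliable (1)."""
    if not frame_mask:
        raise ValueError("frame mask must contain at least one channel")
    return sum(frame_mask) / len(frame_mask)


def classify_frames(
    mask: Sequence[Sequence[int]],
    low: float = 0.3,   # hypothetical cut-off, not taken from the paper
    high: float = 0.7,  # hypothetical cut-off, not taken from the paper
) -> List[str]:
    """Label each frame 'retain', 'reconstruct', or 'discard'.

    Assumed mapping (not confirmed by the source):
      ratio >= high        -> retain (use the frame as it is)
      low <= ratio < high  -> reconstruct (enough reliable units to rebuild from)
      ratio < low          -> discard (too little reliable information)
    """
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")
    labels = []
    for frame_mask in mask:
        ratio = reliable_ratio(frame_mask)
        if ratio >= high:
            labels.append("retain")
        elif ratio >= low:
            labels.append("reconstruct")
        else:
            labels.append("discard")
    return labels


def keep_indices(labels: Sequence[str]) -> List[int]:
    """Indices of frames that go on to later processing (everything not discarded)."""
    return [i for i, label in enumerate(labels) if label != "discard"]


# Toy mask: 3 frames x 4 frequency channels, 1 = reliable, 0 = noise-dominated.
toy_mask = [
    [1, 1, 1, 1],  # ratio 1.00
    [1, 0, 1, 0],  # ratio 0.50
    [0, 0, 0, 1],  # ratio 0.25
]
toy_labels = classify_frames(toy_mask)
toy_kept = keep_indices(toy_labels)
```

With the toy mask, the three frames are labeled retain, reconstruct, and discard in that order, so only the first two frames would be passed on. In a real system the mask would be estimated from the noisy signal and the cut-offs tuned on development data.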