Authors: DUAN Yun; SHAO Yubin[1]; LONG Hua[1]; DU Qingzhi[1]
Affiliation: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650504, China
Source: Microelectronics & Computer (《微电子学与计算机》), 2024, No. 5, pp. 99-108 (10 pages)
Funding: National Natural Science Foundation of China (61761025).
Abstract: The gray-scale logarithmic spectrogram stretches the fundamental-frequency region excessively, which limits gains in identification accuracy for short utterances. To address this problem, a language identification method based on joint decision over nonlinear spectrograms is proposed. First, the speech is energy-normalized and its logarithmic power spectrum is extracted; the frequency scale is then mapped nonlinearly according to human auditory perception to obtain a nonlinear spectrogram. Next, the nonlinear spectrogram is split at equal intervals according to word association characteristics, and a joint decision layer is added at the back end of a ResNet network. Finally, the language of the speech is output. Experimental results show that the proposed method effectively mitigates the shortcomings of the gray-scale logarithmic spectrogram, and that its recognition performance is higher than that of the spectrogram and of improved spectrogram features. Joint decision gives the best results for speech samples with a segment duration of 1.0 s: the recognition rate reaches 94.25% on the broadcast audio dataset and 98.94% on the VoxForge public corpus.
Classification No.: TN912.3 [Electronics and Telecommunications - Communication and Information Systems]
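The pipeline outlined in the abstract (energy normalization, log power spectrum, perceptual frequency warping, equal-interval splitting, joint decision) could be sketched roughly as follows. This is only an illustration under several assumptions that the record does not confirm: the mel scale stands in for the unspecified "human auditory perception" mapping, the frame length, hop, FFT size, and bin count are arbitrary, the ResNet classifier is omitted entirely, and the joint decision is approximated by averaging per-segment class posteriors. All function names are hypothetical. The input is assumed to be at least one frame long.

```python
import numpy as np

def energy_normalize(signal):
    """Scale the waveform so that its average power is (approximately) 1."""
    power = np.mean(signal ** 2)
    return signal / np.sqrt(power + 1e-12)

def log_power_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Log power spectrum per frame, shape (n_frames, n_fft // 2 + 1).
    Frame parameters are assumed values, not taken from the paper."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.log(np.abs(spectrum) ** 2 + 1e-10)

def mel_warp(log_spec, sample_rate=16000, n_bins=64):
    """Resample the frequency axis at points equally spaced on the mel scale.
    The mel scale is an assumption standing in for the paper's mapping."""
    nyquist = sample_rate / 2
    hz = np.linspace(0.0, nyquist, log_spec.shape[1])
    mel_max = 2595.0 * np.log10(1.0 + nyquist / 700.0)
    mel_points = np.linspace(0.0, mel_max, n_bins)
    hz_points = 700.0 * (10.0 ** (mel_points / 2595.0) - 1.0)
    return np.stack([np.interp(hz_points, hz, frame) for frame in log_spec])

def split_equal(spec, frames_per_segment):
    """Split the spectrogram into equal-length segments along time,
    dropping any incomplete segment at the end."""
    n_segments = spec.shape[0] // frames_per_segment
    return [spec[i * frames_per_segment:(i + 1) * frames_per_segment]
            for i in range(n_segments)]

def joint_decision(segment_posteriors):
    """Assumed joint decision: average the per-segment class posteriors
    (one row per segment) and return the index of the best class."""
    mean_posterior = np.mean(np.asarray(segment_posteriors), axis=0)
    return int(np.argmax(mean_posterior))
```

In use, each segment returned by `split_equal` would be scored by a trained classifier (a ResNet in the paper), and the resulting per-segment posteriors passed to `joint_decision` to obtain one language label for the utterance.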