Research on Pathological Voice Recognition Based on a Joint Multi-band Nonlinear Method

Original title: 联合多频带非线性方法的病理嗓音识别研究

Authors: Zhao Pinhui (赵品辉); Ye Xiangyu (叶翔宇); Yan Xiaoyuan (严潇远); Zhang Lili (张莉丽); Tao Zhi (陶智); Zhang Xiaojun (张晓俊) (School of Optoelectronic Science and Engineering, Soochow University, Suzhou 215006, China)

Affiliation: [1] School of Optoelectronic Science and Engineering, Soochow University

Source: Informatization Research (《信息化研究》), 2019, No. 3, pp. 26-30 (5 pages)

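Funding: National Natural Science Foundation of China (No. 61271359); Jiangsu Province Graduate Education and Teaching Reform Project (No. JGLX19-141); Jiangsu Province College Students' Innovation and Entrepreneurship Training Program (No. 201810285086X)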

Abstract: This paper proposes a feature extraction method for pathological voice recognition based on a multi-band nonlinear technique. The speech signal is first filtered by a Bark wavelet filter bank that matches the auditory characteristics of the human ear, and a discrete cosine transform is applied to extract features; the largest Lyapunov exponent is then extracted in each channel. The feature parameters are fused into multi-band nonlinear parameters, and recognition experiments are carried out on the U.S. MEEI pathological voice database using five typical machine learning methods: logistic regression, multi-layer perceptron, support vector machine, random forest, and K-nearest-neighbor classifiers. The experimental results show that the proposed feature achieves an average recognition rate of 97%, which is 4%, 4%, and 18% higher than Mel-frequency cepstral coefficients, Bark-frequency cepstral coefficients, and the largest Lyapunov exponent, respectively. The highest recognition rate reaches 99%.
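The feature pipeline summarized in the abstract (per-band cepstral-like features from a discrete cosine transform, a largest Lyapunov exponent per band, then fusion into one vector) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract gives no details of the Bark wavelet filter bank, so the band signals are taken as already computed; the DCT is applied to per-band log energies; and the exponent is estimated with a simple Rosenstein-style nearest-neighbor divergence fit, which the abstract does not name. The function names (`dct2`, `largest_lyapunov`, `multiband_features`, `logistic_series`) and all parameter values are hypothetical.

```python
import math


def dct2(values):
    """Unnormalised DCT-II of a short list of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def largest_lyapunov(series, dim=2, delay=1, min_sep=10, horizon=5):
    """Rosenstein-style estimate (assumed estimator, in nats per sample).

    Embeds the series, pairs every point with its nearest neighbour that is
    more than `min_sep` samples away in time, and returns the least-squares
    slope of the mean log distance over `horizon` steps. Assumes the series
    is long enough and not degenerate (at least one valid pair per step).
    """
    n_vec = len(series) - (dim - 1) * delay
    vecs = [[series[i + j * delay] for j in range(dim)] for i in range(n_vec)]
    usable = n_vec - horizon
    sums = [0.0] * (horizon + 1)
    counts = [0] * (horizon + 1)
    for i in range(usable):
        best_j, best_d = None, float("inf")
        for j in range(usable):
            if abs(i - j) <= min_sep:
                continue  # skip temporally close points
            d = _dist(vecs[i], vecs[j])
            if 0.0 < d < best_d:
                best_j, best_d = j, d
        if best_j is None:
            continue
        for k in range(horizon + 1):
            d = _dist(vecs[i + k], vecs[best_j + k])
            if d > 0.0:
                sums[k] += math.log(d)
                counts[k] += 1
    mean_log = [s / c for s, c in zip(sums, counts)]
    ks = list(range(horizon + 1))
    k_mean = sum(ks) / len(ks)
    y_mean = sum(mean_log) / len(mean_log)
    num = sum((k - k_mean) * (y - y_mean) for k, y in zip(ks, mean_log))
    den = sum((k - k_mean) ** 2 for k in ks)
    return num / den


def multiband_features(band_signals):
    """Fuse DCT-of-log-energy features with one exponent per band."""
    log_energy = [math.log(sum(x * x for x in band) + 1e-12) for band in band_signals]
    cepstral = dct2(log_energy)
    lyap = [largest_lyapunov(band) for band in band_signals]
    return cepstral + lyap  # fusion by simple concatenation (assumption)


def logistic_series(n, x0=0.3, r=4.0):
    """Toy chaotic signal standing in for a filtered speech band."""
    xs, x = [], x0
    for _ in range(n):
        xs.append(x)
        x = r * x * (1.0 - x)
    return xs


# Two toy "bands"; real input would be the filter bank outputs.
bands = [logistic_series(300, 0.3), logistic_series(300, 0.41)]
features = multiband_features(bands)
print(len(features), features)
```

The logistic map is used only because its largest Lyapunov exponent is known analytically (ln 2 at r = 4), which gives a rough sanity check for the estimator. In a setting like the one the abstract describes, the fused vector would be passed to a classifier such as a support vector machine or random forest.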

Keywords: pathological voice; multi-band; nonlinear; Bark wavelet filter

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

 
