Design of Continuous Audio Recognition Algorithm Based on MFCC Extraction and DTW Optimization


Authors: Wang Hongrui (王鸿瑞); Zhang Yuchen (张玉辰); Chen Lu (陈鹭); Gao Botao (高博韬); Gao Xinyue (高昕悦) (Xi'an Jiaotong University, Xi'an 710049, China)

Affiliation: [1] Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China

Source: China Modern Educational Equipment (《中国现代教育装备》), 2024, No. 17, pp. 41-45, 52 (6 pages)

Abstract: This paper presents a new continuous audio recognition algorithm that combines Mel-frequency cepstral coefficient (MFCC) extraction with dynamic time warping (DTW) optimization. The mathematical principles and algorithm steps are designed first; a large-scale audio database is then preprocessed, and the corresponding features are extracted through time-domain and frequency-domain analysis. Next, the double-threshold method segments the continuous audio into separate audio blocks, and each block is recognized by matching it against the templates in the time-domain and frequency-domain database. The algorithm achieves good continuous audio recognition, with accuracy reaching 89% in both time-domain and frequency-domain recognition. The results can be applied to the development of piano teaching systems and have broad application prospects, particularly in helping learners play a score correctly.

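Keywords: speech recognition; endpoint detection; Mel-frequency cepstral coefficients; dynamic time warping algorithm; time-frequency domain analysis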

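Classification codes: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]; G434 [Electronics and Telecommunications: Information and Communication Engineering]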
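
The abstract names two steps that are easy to illustrate: double-threshold segmentation of continuous audio and DTW matching of each segment against templates. The following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes an energy-only double threshold (the classic method often also uses the zero-crossing rate), `low <= high`, and feature frames (for example MFCC vectors) that have already been computed. All function names, threshold values, and the template labels `C4` and `D4` are hypothetical.

```python
import math


def frame_energy(signal, frame_len, hop):
    """Short-time energy of each full frame of the signal."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, hop)]


def double_threshold_segments(energy, high, low):
    """Return (start, end) frame index pairs, with end exclusive.

    A segment is seeded where energy exceeds `high`, then extended
    backward and forward while energy stays above `low`.
    Assumes low <= high.
    """
    segments = []
    i, n = 0, len(energy)
    while i < n:
        if energy[i] > high:
            start = i
            while start > 0 and energy[start - 1] > low:
                start -= 1
            end = i
            while end < n and energy[end] > low:
                end += 1
            segments.append((start, end))
            i = end
        else:
            i += 1
    return segments


def dtw_distance(a, b):
    """DTW distance between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def classify(segment_features, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates,
               key=lambda label: dtw_distance(segment_features, templates[label]))


# Hypothetical one-dimensional "features" standing in for MFCC frames.
templates = {"C4": [[0.0], [1.0]], "D4": [[5.0], [6.0]]}
energy = [0.0, 0.2, 1.0, 0.3, 0.0, 0.0, 0.4, 2.0, 0.1]
segments = double_threshold_segments(energy, high=0.5, low=0.15)
label = classify([[0.1], [0.9], [1.0]], templates)
```

In a fuller pipeline, each segment returned by `double_threshold_segments` would be converted to a feature sequence and passed to `classify`. The full cost matrix keeps the sketch readable, at the price of O(nm) memory per comparison.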

 
