An Audio Classification Algorithm Based on Audio Segmentation  Cited by: 1


Authors: YANG Guian; SHAO Yubin [1]; LONG Hua [1]; DU Qingzhi [1]

Affiliation: [1] Kunming University of Science and Technology, Kunming, Yunnan 650500, China

Source: Communications Technology (《通信技术》), 2021, No. 2, pp. 317-322 (6 pages)

Abstract: To address inaccurate speech/music classification of pure speech, pure music, and mixtures of the two, an audio classification algorithm based on audio segmentation is proposed. Segmenting with the energy-entropy ratio feature yields more accurate music segments, whereas segmenting with the amplitude root-mean-square (RMS) feature yields more accurate speech segments and avoids over-segmenting speech. The start and end points of the segments produced by the two methods are sorted in ascending order and combined in pairs to form new audio segments, which constitute the segmentation result; each segment in that result contains a single type of audio. Two features, the kurtosis of the amplitude and the average fundamental frequency, are extracted from each segment, and a Gaussian mixture model (GMM) serves as the back-end classifier. Finally, to eliminate over-segmentation, adjacent segments of the same type are merged to give the final classification result. Experiments show that the proposed algorithm achieves a segmentation accuracy of 98.24% on mixed audio and, with only two-dimensional features, classification accuracies of 98% on single-type audio and 98.61% on mixed audio, an average improvement of 3.80% in classification accuracy over comparable algorithms.
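The boundary-combination and merging steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`combine_boundaries`, `kurtosis`, `merge_adjacent`) are hypothetical, "combined in pairs" is assumed to mean pairing consecutive sorted boundary points, and the kurtosis is assumed to be the plain fourth standardized moment of the amplitude values. The energy-entropy ratio and RMS segmenters, fundamental-frequency estimation, and the GMM classifier are left out; segment labels are supplied by hand.

```python
from typing import List, Tuple

Segment = Tuple[float, float]               # (start_s, end_s)
LabeledSegment = Tuple[float, float, str]   # (start_s, end_s, label)


def combine_boundaries(segs_a: List[Segment], segs_b: List[Segment]) -> List[Segment]:
    """Pool the start/end points of two segmentations, sort them in
    ascending order, and pair consecutive points into new segments.
    (Assumption: "combined in pairs" means neighbouring points.)"""
    points = sorted({p for seg in segs_a + segs_b for p in seg})
    return list(zip(points[:-1], points[1:]))


def kurtosis(amplitudes: List[float]) -> float:
    """Fourth standardized moment of the amplitude values
    (assumed definition; no excess-kurtosis offset is applied)."""
    n = len(amplitudes)
    mean = sum(amplitudes) / n
    var = sum((a - mean) ** 2 for a in amplitudes) / n
    if var == 0.0:
        return 0.0  # constant signal: kurtosis is undefined, return 0 by convention
    return sum((a - mean) ** 4 for a in amplitudes) / (n * var ** 2)


def merge_adjacent(segments: List[LabeledSegment]) -> List[LabeledSegment]:
    """Merge contiguous segments that share a label, which removes
    over-segmentation after classification."""
    merged: List[LabeledSegment] = []
    for start, end, label in segments:
        if merged and merged[-1][2] == label and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end, label)
        else:
            merged.append((start, end, label))
    return merged


# Hypothetical boundaries (seconds) from the two segmentation passes.
by_energy_entropy = [(0.0, 4.0), (4.0, 10.0)]
by_rms = [(0.0, 4.5), (4.5, 10.0)]
combined = combine_boundaries(by_energy_entropy, by_rms)

# Labels would come from the classifier; here they are assigned by hand.
labels = ["speech", "speech", "music"]
final = merge_adjacent([(s, e, lab) for (s, e), lab in zip(combined, labels)])
```

In this toy case the two passes disagree on where speech ends (4.0 s versus 4.5 s), so combining the boundaries creates a short extra segment. Once that segment is labelled the same as its neighbour, merging absorbs it.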

Keywords: audio classification; audio features; audio segmentation; kurtosis of amplitude; average fundamental frequency

Classification number: TP39 [Automation and Computer Technology: Computer Application Technology]

 
