Coal gangue audio classification method based on improved EfficientNet

Authors: SONG Qingjun[1]; JIAO Shouyue; JIANG Haiyan[1]; SONG Qinghui[1]; HAO Wenchao (School of Intelligent Equipment, Shandong University of Science and Technology, Tai'an 271000, China)

Affiliation: [1] School of Intelligent Equipment, Shandong University of Science and Technology, Tai'an 271000, Shandong, China

Source: Journal of Mine Automation (工矿自动化), 2025, No. 1, pp. 138-144 (7 pages)

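Funding: National Natural Science Foundation of China, General Program (52174145); Shandong Province Science and Technology SME Innovation Capability Improvement Project (2022TSGC1271, 2023TSGC0620).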

Abstract: To address severe interference from equipment operating noise during coal gangue audio feature extraction, and the information loss that a single extraction method tends to cause, a coal gangue audio classification method based on an improved EfficientNet is proposed. Features are extracted by combining the Mel spectrogram with Gammatone cepstral coefficients, which effectively captures the low-frequency information and fine detail in gangue sounds. EfficientNet-B0 is selected as the backbone network and improved in two ways. First, the original multi-scale channel attention module is replaced with a convolutional block attention module, yielding the Convolutional Attention Feature Fusion (CAFF) module, which lets the network learn to assign different weights to features at different spatial positions and generate new effective features. Second, a Frequency-domain Channel Attention (FCA) module is embedded in parallel within the original MBConv module to strengthen the representational capacity of the feature maps and thereby improve overall network performance. Experimental results show that introducing the CAFF module improved accuracy by 0.61% and the F1 score by 0.52%, with faster convergence, indicating that CAFF effectively enhances the model's ability to capture spectral features. Adding the FCA module improved accuracy by 0.45% and the F1 score by 0.62%, indicating that stacking the modules further improves generalization and the handling of complex features. The improved EfficientNet model achieved an accuracy of 91.90% with a standard deviation of 0.108, significantly outperforming comparable audio classification models.
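
The abstract pairs Mel-spectrum features with Gammatone cepstral coefficients. As a rough illustration of how the two front ends place their filters differently, the sketch below spaces filter centre frequencies evenly on the Mel scale and on the ERB-rate scale that Gammatone filterbanks commonly use. The scale formulas are the widely used textbook forms and are not taken from the paper. The function names, the 50-8000 Hz range, and the filter count of 8 are all assumptions chosen for illustration.

```python
import math

def hz_to_mel(f_hz):
    """Hz -> Mel, using the common 2595*log10(1 + f/700) form (assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def hz_to_erb_rate(f_hz):
    """Hz -> ERB-rate, using the Glasberg-Moore form 21.4*log10(1 + 0.00437 f)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def center_frequencies(f_min, f_max, n_filters, to_scale, from_scale):
    """Return n_filters (>= 2) centre frequencies in Hz, evenly spaced on the
    perceptual scale defined by the to_scale / from_scale pair."""
    lo, hi = to_scale(f_min), to_scale(f_max)
    step = (hi - lo) / (n_filters - 1)
    return [from_scale(lo + i * step) for i in range(n_filters)]

# Hypothetical analysis band and filter count, not values from the paper.
mel_centers = center_frequencies(50.0, 8000.0, 8, hz_to_mel, mel_to_hz)
erb_centers = center_frequencies(50.0, 8000.0, 8, hz_to_erb_rate, erb_rate_to_hz)

for m, e in zip(mel_centers, erb_centers):
    print(f"mel: {m:8.1f} Hz   erb-rate: {e:8.1f} Hz")
```

Printing the two lists side by side shows where each scale concentrates its filters within the same band. A real pipeline would go on to build triangular (Mel) or Gammatone filters around these centres and apply them to the signal spectrum.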

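Keywords: fully mechanized top-coal caving; coal gangue recognition; audio feature extraction; EfficientNet; Mel spectrum features; Gammatone cepstral coefficients; attention mechanism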

CLC number: TD823.49 [Mining Engineering: Coal Mining]
