A lightweight bird sound recognition method combining network pruning and knowledge distillation

Authors: SHEN Xiaohu [1,2]; LI Guanyu; SHI Hongfei [2]; WANG Chuanzhi

Affiliations: [1] Department of Forensic Science and Technology, Jiangsu Police Institute, Nanjing 210031, China; [2] Key Laboratory of National Forestry and Grassland Administration on Wildlife Evidence Technology, Nanjing 210023, China; [3] College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China; [4] iFLYTEK Co., Ltd., Hefei 230088, China

Source: Journal of Applied Acoustics (《应用声学》), 2025, No. 2, pp. 350-361 (12 pages)

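Funding: National Natural Science Foundation of China (61976032); Open Project of the Key Laboratory of National Forestry and Grassland Administration on Wildlife Evidence Technology (KLNPC2102); 2023 Jiangsu Province Outstanding Science and Technology Innovation Team of Higher Education Institutions project "Forensic Toxicology under the Framework of Artificial Intelligence".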

Abstract: In bird sound recognition, most models are parameter-heavy, and efficient networks that can be deployed on passive acoustic monitoring (PAM) devices are lacking. Targeting the structural characteristics of EfficientNet, this paper combines structured pruning with knowledge distillation so that the pruned network retains good generalization ability and can meet network requirements under different resource configurations. On the one hand, an inverse knapsack criterion is used to formulate the relationship between pruned channels and resources, and channel pruning is carried out while the network framework is preserved. On the other hand, an internal distillation loss component for the MBConv modules is added to the knowledge distillation method during training, so that cross-group information exchange preserves the distance between feature maps before and after pruning. In detection and classification experiments on 10 classes of bird sounds collected in the Laoshan Forest in Pukou District, Nanjing, the compressed network reaches a classification accuracy of 91.64% with only 3.0 M parameters. The proposed method compresses the network while largely preserving classification accuracy and, compared with mainstream lightweight networks of the same scale, is better suited to the device requirements of PAM for bird sound recognition.
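The abstract does not state the loss formula, so the following is only a generic sketch of how a distillation objective with an added feature-map distance term is commonly assembled; it is not the authors' formulation. The function names, the weights `alpha` and `beta`, the temperature value, and the use of mean squared distance are all assumptions made for illustration. It also assumes the student (pruned) and teacher (unpruned) feature maps have already been brought to the same flattened length.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def feature_distance(student_feat, teacher_feat):
    """Mean squared distance between two flattened feature maps (assumed metric)."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(student_feat)

def distillation_loss(student_logits, teacher_logits, label,
                      student_feats, teacher_feats,
                      temperature=4.0, alpha=0.5, beta=1.0):
    """Hard-label cross-entropy + soft-target KL + per-block feature distance.

    student_feats / teacher_feats: one flattened feature map per block
    (hypothetical stand-ins for intermediate MBConv outputs).
    """
    # Cross-entropy of the student against the ground-truth label.
    ce = -math.log(softmax(student_logits)[label] + 1e-12)
    # Soft-target term, scaled by T^2 as in standard knowledge distillation.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kd = temperature ** 2 * kl_divergence(p_teacher, p_student)
    # Internal term: penalize drift between student and teacher feature maps.
    feat = sum(feature_distance(s, t) for s, t in zip(student_feats, teacher_feats))
    return (1 - alpha) * ce + alpha * kd + beta * feat
```

When student and teacher agree on both logits and features, only the weighted cross-entropy remains; any gap between intermediate feature maps adds to the loss in proportion to `beta`.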

Keywords: network pruning; knowledge distillation; bird sound recognition; lightweight network; passive acoustic monitoring

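Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TN912.3 [Automation and Computer Technology: Control Science and Engineering]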

 
