Lightweight Animal Pose Estimation with Integrated Spatial and Channel Reconstructive Convolutions and Attention

Authors: ZAI Qingpeng (宰清鹏); XU Yang (徐杨) (College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China; Guiyang Aluminum-Magnesium Design and Research Institute Co., Ltd., Guiyang 550009, China)

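Affiliations: [1] College of Big Data and Information Engineering, Guizhou University, Guiyang 550025 [2] Guiyang Aluminum-Magnesium Design and Research Institute Co., Ltd., Guiyang 550009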

Source: Computer Engineering and Applications (《计算机工程与应用》), 2025, No. 6, pp. 282-294 (13 pages)

Funding: Guizhou Provincial Science and Technology Program Project (黔科合支撑[2023]一般326).

Abstract: Animal pose estimation is increasingly important in fields such as behavioral ecology, animal health monitoring, and wildlife conservation. However, current mainstream animal pose estimation algorithms focus heavily on accuracy, which drives up network complexity and computational cost and limits their use on mobile devices and embedded platforms. To address this problem, this paper proposes SPANet, a multi-scale animal pose estimation network that combines spatial and channel reconstruction convolution with pyramid split attention. First, the bottleneck layer of the high-resolution network is redesigned as EPSAneck using pyramid split attention and coordinate attention, which reduces the computational cost of overusing large convolution kernels while strengthening the network's ability to extract useful features. Second, the SCCAblock basic module is proposed, built on spatial and channel reconstruction convolution and coordinate attention; it significantly reduces computational redundancy and memory access while enhancing information exchange between the channel and spatial dimensions. Finally, the fusion of the network's output features is redesigned with a deconvolution module to further improve accuracy. Experimental results show that, compared with the high-resolution network, the proposed model improves average precision on the AP10K test set by 1.8 percentage points while reducing floating-point operations by 48.5% and the number of model parameters by 67.0%. On the AnimalPose dataset, floating-point operations are reduced by 49.5% and the number of model parameters by 67.0%. These results indicate that the network achieves a small improvement in prediction accuracy while reducing model complexity.
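The abstract names coordinate attention as a component of both EPSAneck and SCCAblock but does not describe its internals. The sketch below illustrates only the general idea behind coordinate attention: pool a feature map along each spatial direction, turn the pooled descriptors into gates, and rescale every position by its row gate and column gate. It is a simplified illustration, not the SPANet implementation. It handles a single channel, and the learned 1x1 convolutions of a real coordinate attention layer are replaced by the identity, which is an assumption made for brevity. The function name `coordinate_gate` is hypothetical.

```python
import math


def sigmoid(value):
    """Logistic function used to squash pooled descriptors into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def coordinate_gate(feature_map):
    """Rescale one channel by direction-aware gates.

    feature_map: list of H rows, each a list of W floats.
    Assumption: the learned transforms of real coordinate attention
    are omitted, so the gates come directly from the pooled means.
    """
    height = len(feature_map)
    width = len(feature_map[0])

    # Pool along the width: one descriptor per row (keeps vertical position).
    row_means = [sum(row) / width for row in feature_map]
    # Pool along the height: one descriptor per column (keeps horizontal position).
    col_means = [
        sum(feature_map[i][j] for i in range(height)) / height
        for j in range(width)
    ]

    row_gate = [sigmoid(m) for m in row_means]
    col_gate = [sigmoid(m) for m in col_means]

    # Each position is scaled by the gate of its row and the gate of its column.
    return [
        [feature_map[i][j] * row_gate[i] * col_gate[j] for j in range(width)]
        for i in range(height)
    ]


if __name__ == "__main__":
    demo = [[1.0, -1.0], [1.0, -1.0]]
    for row in coordinate_gate(demo):
        print(row)
```

Because the gates are built from row-wise and column-wise pooling rather than a single global average, they retain positional information along both axes, which is the property that makes this kind of attention attractive for keypoint localization.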

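Keywords: animal pose estimation; lightweight; high resolution; attention mechanism; spatial and channel reconstruction convolution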

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
