Object detection method based on parallel pooling of attention and multi-feature fusion enhancement


Authors: CHENG Jie, BIAN Changzhi, ZHANG Jing, LI Xiaoxia [1,2], DING Nan

Affiliations: [1] School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China; [2] Sichuan Industrial Autonomous and Controllable Artificial Intelligence Engineering Technology Research Center, Mianyang 621010, Sichuan, China; [3] Mianyang Cigarette Factory, China Tobacco Sichuan Industrial Co., Ltd., Mianyang 621000, Sichuan, China

Source: Modern Electronics Technique, 2025, No. 5, pp. 59-67 (9 pages)

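Funding: General Program of the National Natural Science Foundation of China (62071399); Key R&D Projects of the Sichuan Science and Technology Program (2023YFG0262, 2023NSFSC1388).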

Abstract: To address the loss of detail information and the insufficient feature fusion caused by dimensionality reduction in channel attention, a parallel pooling attention and multi-feature fusion enhancement (PPA-MfFE) method is proposed. First, two pooling modules process the input image in parallel to enhance feature attention. The entropy-guided pooling module uses channel information entropy to generate feature weight coefficients, strengthening detail information such as edges and textures. The direction-aware pooling module captures the spatial direction information of the image along the vertical and horizontal directions, then computes the channel mean so that dimensionality is reduced gradually while key features are retained. Second, the multi-feature fusion enhancement module adaptively selects the convolution kernel size using a logarithmic function of the feature map scale, reshapes the convolved features by group into three feature sub-maps along the channel, height and width directions with the same dimensions as the input image, and multiplies them element-wise to obtain an enhanced feature map. Finally, the enhanced feature map is fused with the input image by weighting, which enhances both the location and the detail information of the object. Experimental results show that, with the number of parameters unchanged, the method improves mAP@0.5 over YOLOX and YOLOv7 by 4.62% and 4.46%, respectively, on the VOC2007 dataset, and by 4.57% and 4.63%, respectively, on the COCO dataset.
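The abstract names two mechanisms without giving their formulas: feature weights derived from channel information entropy, and a convolution kernel size chosen by a logarithmic function of the feature map scale. The Python sketch below illustrates one plausible reading of each and is not the authors' implementation. All function names, the histogram-based entropy estimate, the sum-to-one normalisation, and the parameters `bins`, `gamma` and `b` are assumptions; the kernel-size rule follows the convention used in ECA-Net and may differ from the rule in the paper.

```python
import math


def channel_entropy(channel, bins=8):
    """Shannon entropy (in bits) of a histogram over one channel's values.

    `channel` is a 2-D list (height x width). The histogram estimate and the
    bin count are assumptions made for illustration.
    """
    values = [v for row in channel for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant channel carries no information
    counts = [0] * bins
    for v in values:
        # Map v into [0, bins - 1]; the maximum value falls into the last bin.
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def entropy_weights(feature_map, bins=8):
    """Per-channel weights proportional to channel entropy (they sum to 1).

    `feature_map` is a list of channels. Normalising by the entropy sum is an
    assumed choice; the paper may use a different mapping.
    """
    entropies = [channel_entropy(ch, bins) for ch in feature_map]
    total = sum(entropies)
    if total == 0:
        return [1.0 / len(entropies)] * len(entropies)
    return [e / total for e in entropies]


def adaptive_kernel_size(scale, gamma=2, b=1):
    """Odd kernel size that grows with log2 of `scale` (ECA-style, assumed)."""
    k = int(abs((math.log2(scale) + b) / gamma))
    return k if k % 2 else k + 1


if __name__ == "__main__":
    textured = [[0, 1], [0, 1]]   # two distinct values, evenly split
    flat = [[3, 3], [3, 3]]       # constant channel
    print(entropy_weights([textured, flat], bins=2))
    print(adaptive_kernel_size(64), adaptive_kernel_size(256))
```

Under these assumptions, a channel with more varied activations (edges, texture) receives a larger weight than a flat one, and the kernel size grows slowly with scale, which keeps the parameter count nearly unchanged.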

Keywords: channel attention; dimensionality reduction; parallel pooling; multi-feature fusion enhancement; adaptive; object detection

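CLC number: TN911.73-34 [Electronics and Telecommunications: Communication and Information Systems]; TP391.4 [Electronics and Telecommunications: Information and Communication Engineering]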

 
