Multiscale feature fusion and cross-guidance for few-shot semantic segmentation  Cited by: 1


Authors: Guo Jing [1]; Wang Fei

Affiliations: [1] Department of Electronic Information, Jinzhong Vocational and Technical College, Jinzhong 030600, China; [2] Intelligent and Available Systems Centre, University of Brighton, Brighton BN2 4AT, UK

Source: Journal of Image and Graphics (中国图象图形学报), 2024, No. 5, pp. 1265-1276 (12 pages)

Funding: Science and Technology Innovation Program of Higher Education Institutions in Shanxi (202201D121009).

Abstract: Objective Building information interaction between the support branch and the query branch plays an important role in improving the performance of few-shot semantic segmentation. A few-shot semantic segmentation algorithm based on multiscale feature fusion and cross-guidance is proposed. Method A pair of weight-sharing backbone networks maps the input images of the two branches into a deep feature space, and the output low-level, mid-level, and high-level features are fused across scales to construct multiscale features. The mask of the support branch is used to decompose the support features into target foreground and background feature maps. A feature interaction module is designed to establish information interaction between the target foreground of the support branch and the feature map of the entire query branch, which strengthens the representation of task-relevant features, and masked average pooling is used to generate prototype sets for the target foreground and background regions. A parameter-free metric then computes the cosine similarity between the support features and the prototype set and between the query features and the prototype set, and the mask of the corresponding image is produced from the similarity values. Results Experiments were conducted on the open-source PASCAL-5^i (pattern analysis, statistical modeling and computational learning) and COCO-20^i (common objects in context) datasets. With VGG-16 (Visual Geometry Group), ResNet-50 (residual neural network), and ResNet-101 as backbone networks, the proposed model on the 1-way 1-shot task obtains a mean intersection over union (mIoU) of 50.2%, 53.2%, 57.1% and 23.9%, 35.1%, 36.4% on the two datasets, respectively, and a foreground and background intersection over union (FB-IoU) of 68.3%, 69.4%, 72.3% and 60.1%, 62.4%, 64.1%. On the 1-way 5-shot task, it obtains an mIoU of 52.9%, 55.7%, 59.7% and 32.5%, 37.3%, 38.3%, and an FB-IoU of 69.7%, 72.5%, 74.6% and 64.2%, 66.2%, 66.7%. Conclusion Compared with current mainstream few-shot semantic segmentation models, the proposed model achieves higher mIoU and FB-IoU on both the 1-way 1-shot and 1-way 5-shot tasks, and the overall performance improvement is significant.

Objective Few-shot semantic segmentation is one of the fundamental and challenging tasks in computer vision. It aims to use a limited number of annotated support samples to guide the segmentation of unknown objects in a query image. Compared with traditional semantic segmentation, few-shot semantic segmentation methods effectively alleviate two problems: the high cost of per-pixel annotation, which greatly limits the application of semantic segmentation in practical scenarios, and the weak generalization of segmentation models to novel-class targets. Existing few-shot semantic segmentation methods mainly use a meta-learning architecture with a dual-branch network. The support branch consists of the support images and their corresponding per-pixel ground-truth masks, the query branch takes as input the new image to be segmented, and both branches share the same semantic classes. The valuable information in the support images is extracted to guide the segmentation of unknown novel classes in the query images. However, different instances of the same semantic class may vary in appearance and scale, so information extracted solely from the support branch is insufficient to guide the segmentation of unknown novel classes in query images. Although some researchers have attempted to improve few-shot semantic segmentation through bidirectional guidance, existing bidirectional guidance models rely too heavily on the pseudo masks predicted by the query branch at the intermediate stage. If the initial predictions of the query branch are poor, the shared semantics generalize weakly, which hinders segmentation performance. Method A multiscale feature fusion and cross-guidance network for few-shot semantic segmentation is proposed to alleviate these problems, attempting to construct the information interaction between the support branch and the query branch to improve the performance o
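The prototype step described in the abstract (masked average pooling followed by a parameter-free cosine-similarity comparison) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: it omits the backbone, the multiscale fusion, and the feature interaction module, and the function names (`masked_average_pooling`, `predict_mask`) and the toy feature values are hypothetical, chosen only for illustration.

```python
import math


def masked_average_pooling(features, mask):
    """Average each channel over the pixels where mask == 1.

    features: list of C channels, each an H x W list of floats.
    mask:     H x W list of 0/1 values.
    Returns a length-C prototype vector.
    """
    area = sum(sum(row) for row in mask)
    if area == 0:
        raise ValueError("mask selects no pixels")
    return [
        sum(f * m for frow, mrow in zip(channel, mask) for f, m in zip(frow, mrow)) / area
        for channel in features
    ]


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def predict_mask(features, fg_proto, bg_proto):
    """Label a pixel 1 if it is more similar to the foreground prototype."""
    height, width = len(features[0]), len(features[0][0])
    pred = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel = [channel[y][x] for channel in features]
            fg = cosine_similarity(pixel, fg_proto)
            bg = cosine_similarity(pixel, bg_proto)
            row.append(1 if fg > bg else 0)
        pred.append(row)
    return pred


# Toy 2-channel, 2x2 support feature map and binary mask (made-up values).
support_features = [
    [[1.0, 0.9], [0.1, 0.0]],  # channel 0: high on the foreground pixels
    [[0.0, 0.1], [0.9, 1.0]],  # channel 1: high on the background pixels
]
support_mask = [[1, 1], [0, 0]]
background_mask = [[1 - m for m in row] for row in support_mask]

# One foreground and one background prototype from the support image.
fg_proto = masked_average_pooling(support_features, support_mask)
bg_proto = masked_average_pooling(support_features, background_mask)

# Toy query features; each pixel gets the label of the closer prototype.
query_features = [
    [[0.8, 0.2], [0.7, 0.1]],
    [[0.1, 0.9], [0.2, 0.8]],
]
query_pred = predict_mask(query_features, fg_proto, bg_proto)
```

Because the comparison has no learnable parameters, the same `predict_mask` call could also be applied to the support features themselves, which matches the abstract's statement that similarities are computed for both the support and the query features.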

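Keywords: few-shot semantic segmentation; multiscale feature fusion; cross-branch cross-guidance; feature interaction; masked average pooling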

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
