Authors: LIU Yu; GUO Yingchun; ZHU Ye; YU Ming
Affiliations: [1] School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300401, China; [2] School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China
Source: Chinese Journal of Liquid Crystals and Displays (《液晶与显示》), 2024, Issue 11, pp. 1494-1505 (12 pages)
Funding: National Natural Science Foundation of China Youth Program (No.62102129); National Natural Science Foundation of China General Program (No.62276088); Natural Science Foundation of Hebei Province (No.F2021202030, No.F2019202381, No.F2019202464)
Abstract: Few-shot image semantic segmentation aims to segment novel classes with only a few annotated examples. To address the insufficient mining of semantic information in existing methods, a few-shot image semantic segmentation method based on a dual cross-attention network is proposed. The method adopts a Transformer structure and uses dual cross-attention modules to learn long-range dependencies between multi-scale query and support features in both the channel and spatial dimensions. First, a channel cross-attention module is proposed and combined with a position cross-attention module to form the dual cross-attention module: the channel cross-attention module learns the channel-wise semantic relationships between query and support features, while the position cross-attention module captures the long-range contextual correlations between them. Then, stacking multiple dual cross-attention modules provides the query image with multi-scale interaction features that carry rich semantic information. Finally, an auxiliary supervision loss is introduced, and the multi-scale interaction features are connected to the decoder through upsampling and residual connections to obtain accurate segmentation of novel classes. The proposed method achieves 69.9% (1-shot) and 72.4% (5-shot) mIoU on the PASCAL-5i dataset, and 48.9% (1-shot) and 54.6% (5-shot) mIoU on the COCO-20i dataset, reaching state-of-the-art segmentation performance compared with mainstream methods.
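For illustration only, the following is a minimal PyTorch sketch of how a dual cross-attention block combining position (spatial) and channel cross-attention between query and support features might be organized. The module names, projections, scaling, and the simple additive fusion are assumptions made for this sketch and do not reproduce the authors' exact implementation, multi-scale arrangement, or auxiliary-loss design.

```python
# Minimal sketch of a dual cross-attention block, assuming query/support feature
# maps of shape (B, C, H, W). Illustrative only; not the paper's exact design.
import torch
import torch.nn as nn


class PositionCrossAttention(nn.Module):
    """Spatial cross-attention: query pixels attend to support pixels."""

    def __init__(self, channels):
        super().__init__()
        self.q_proj = nn.Conv2d(channels, channels, 1)
        self.k_proj = nn.Conv2d(channels, channels, 1)
        self.v_proj = nn.Conv2d(channels, channels, 1)
        self.scale = channels ** -0.5

    def forward(self, query_feat, support_feat):
        b, c, h, w = query_feat.shape
        q = self.q_proj(query_feat).flatten(2).transpose(1, 2)    # (B, HW, C)
        k = self.k_proj(support_feat).flatten(2)                  # (B, C, HW)
        v = self.v_proj(support_feat).flatten(2).transpose(1, 2)  # (B, HW, C)
        attn = torch.softmax(q @ k * self.scale, dim=-1)          # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return out + query_feat                                   # residual connection


class ChannelCrossAttention(nn.Module):
    """Channel cross-attention: query channels attend to support channels."""

    def forward(self, query_feat, support_feat):
        b, c, h, w = query_feat.shape
        q = query_feat.flatten(2)                                  # (B, C, HW)
        k = support_feat.flatten(2)                                # (B, C, HW)
        v = support_feat.flatten(2)                                # (B, C, HW)
        attn = torch.softmax(q @ k.transpose(1, 2) * (h * w) ** -0.5, dim=-1)  # (B, C, C)
        out = (attn @ v).reshape(b, c, h, w)
        return out + query_feat                                    # residual connection


class DualCrossAttention(nn.Module):
    """Runs both branches on the query/support pair and fuses them (here, a sum)."""

    def __init__(self, channels):
        super().__init__()
        self.position = PositionCrossAttention(channels)
        self.channel = ChannelCrossAttention()

    def forward(self, query_feat, support_feat):
        return self.position(query_feat, support_feat) + self.channel(query_feat, support_feat)


if __name__ == "__main__":
    query = torch.randn(2, 64, 32, 32)
    support = torch.randn(2, 64, 32, 32)
    dca = DualCrossAttention(64)
    print(dca(query, support).shape)  # torch.Size([2, 64, 32, 32])
```

In this sketch, applying one such block per feature scale and collecting the outputs would yield multi-scale interaction features that a decoder could consume via upsampling and residual connections, as the abstract describes.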
Keywords: few-shot image semantic segmentation; Transformer structure; channel cross-attention; dual cross-attention; auxiliary loss
CLC Number: TP391.4 [Automation and Computer Technology: Computer Application Technology]