Fully Convolutional Neural Network with Attention Module for Semantic Segmentation  Cited by: 16


Authors: OU Yangliu; HE Xi; QU Shaojun (College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China; Hunan Xiangjiang Artificial Intelligence Academy, Hunan Normal University, Changsha 410081, China)

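Affiliations: [1] College of Information Science and Engineering, Hunan Normal University, Changsha 410081; [2] Hunan Xiangjiang Artificial Intelligence Academy, Hunan Normal University, Changsha 410081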

Source: Journal of Frontiers of Computer Science and Technology, 2022, No. 5, pp. 1136-1145 (10 pages)

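Funding: National Natural Science Foundation of China (12071126); Scientific Research Project of the Hunan Provincial Department of Education (19C1149); National Undergraduate Innovation and Entrepreneurship Training Program (S202010542021).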

Abstract: A fully convolutional neural network is a powerful end-to-end model that is widely used in semantic segmentation and has achieved great success. Researchers have proposed a series of methods based on fully convolutional neural networks. However, with the continuous downsampling performed by convolution and pooling, the contextual information of the image is lost, which degrades pixel-level classification. To address this loss of context in fully convolutional networks, a pixel-based attention method is proposed. It computes the relationships between pixels of the high-level feature map to obtain global information and strengthen the correlation between pixels, and is then combined with atrous spatial pyramid pooling to further extract image feature information. To address the loss of pixels in the high-level feature map of an image, an attention method based on different levels of the image is proposed. This method uses the information in the high-level feature map as a guide to mine the hidden information in the low-level feature map, and then fuses the result with the high-level feature map, making full use of both high-level and low-level feature information. In the experiments, the effectiveness of the proposed method is verified by comparing the effects of the different proposed modules on the segmentation performance of a fully convolutional neural network. Experiments are also carried out on Cityscapes, a widely recognized image semantic segmentation dataset, and compared with current advanced networks. The results show that the proposed method has advantages in both objective evaluation metrics and subjective visual quality. It achieves 69.3% accuracy on the test set of the official Cityscapes website, which is 3 to 5 percentage points higher than several recent advanced networks.

Keywords: fully convolutional neural network; atrous spatial pyramid pooling; attention model; semantic segmentation

CLC number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
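
The abstract describes two attention ideas without giving formulas: attention between pixels of the high-level feature map, and high-level features guiding low-level features before fusion. The NumPy sketch below is only an illustration of those two ideas under assumptions that are not taken from the paper: dot-product affinities with a softmax, a residual connection, a sigmoid channel gate computed from globally averaged high-level features, nearest-neighbour upsampling, additive fusion, and equal channel counts at both levels. The function names `pixel_attention` and `level_attention` are hypothetical, and atrous spatial pyramid pooling is not included.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pixel_attention(feat):
    """Assumed form of pixel-wise attention on a (C, H, W) feature map.

    Every pixel attends to every other pixel, so each output position
    carries context from the whole map.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)            # (C, N), one column per pixel
    affinity = x.T @ x                    # (N, N) pairwise dot-product similarity
    weights = softmax(affinity, axis=-1)  # row i: how pixel i weights all pixels
    context = x @ weights.T               # (C, N): weighted sum of pixel features
    return (x + context).reshape(c, h, w)  # residual connection (assumption)


def level_attention(high, low):
    """Assumed form of high-level guidance over low-level features.

    high: (C, h, w) coarse map; low: (C, H, W) with H, W integer multiples
    of h, w. Returns a fused (C, H, W) map.
    """
    _, big_h, big_w = low.shape
    sh, sw = big_h // high.shape[1], big_w // high.shape[2]
    # Channel gate from globally averaged high-level features (assumption).
    pooled = high.mean(axis=(1, 2), keepdims=True)      # (C, 1, 1)
    gate = 1.0 / (1.0 + np.exp(-pooled))                # sigmoid, values in (0, 1)
    guided_low = low * gate                             # re-weight low-level channels
    # Nearest-neighbour upsampling of the high-level map (assumption).
    high_up = high.repeat(sh, axis=1).repeat(sw, axis=2)
    return high_up + guided_low                         # additive fusion (assumption)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = rng.standard_normal((4, 2, 3))
    low = rng.standard_normal((4, 4, 6))
    print(pixel_attention(high).shape, level_attention(high, low).shape)
```

A trained network would use learned projections (for example 1x1 convolutions) in place of the raw dot products and the fixed gate used here; the sketch only shows the data flow.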

 
