Dynamic association learning of self-attention and convolution in image restoration  Cited by: 1

Authors: Jiang Kui; Jia Xuemei; Huang Wenxin; Wang Wenbing; Wang Zheng; Jiang Junjun

Affiliations: [1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150000, China [2] School of Computer Science, Wuhan University, Wuhan 430072, China [3] School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, China [4] Hangzhou Lingban Technology Ltd., Hangzhou 310000, China

Source: Journal of Image and Graphics, 2024, No. 4, pp. 890-907 (18 pages)

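Funding: National Natural Science Foundation of China (6230010538, 22301213); Hubei Province Key Research and Development Fund (2021YFC3320301); Chinese Association for Artificial Intelligence (CAAI)-Huawei MindSpore Open Fund (2022BAA033).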

Abstract: Objective Convolutional neural networks (CNNs) and self-attention (SA) have achieved great success in multimedia applications. However, few studies have effectively coordinated these two architectures for image restoration. Considering their respective strengths and weaknesses, this paper proposes an association learning approach that exploits the advantages of both while suppressing their shortcomings, to achieve high-quality and efficient image restoration. Method The paper combines the strengths of the CNN and SA architectures, in particular making full use of the local perception and translation invariance of CNNs and the global aggregation ability of SA in specific local-context and global-structure representations. In addition, the degradation distribution of an image reveals the location and degree of degradation in image space. Inspired by this, the paper introduces a degradation prior into background restoration and, on this basis, proposes a dynamic association learning method for image restoration. Its core is a novel multi-input attention module that associates the removal of degradation perturbations with background restoration. By incorporating depthwise separable convolution, the method draws on both CNN and SA to achieve efficient, high-quality image restoration. Results Ablation experiments on the Test1200 dataset verify the effectiveness of each component: fusing CNN and SA improves the representational capacity of the model, and association learning between degradation removal and background restoration improves overall restoration quality. The method was compared with more than 10 other methods on synthetic and real data across 3 image restoration tasks and achieved significant improvements. For image deraining, the proposed ELF (image deraining meets association learning and Transformer) improves PSNR (peak signal-to-noise ratio) by 0.9 dB over MPRNet (multi-stage progressive image restoration network) on the synthetic Test1200 dataset; for underwater image enhancement, ELF surpasses Ucolor by 4.15 dB on the R90 dataset; for low-light image enhancement, ELF gains 1.09 dB over LLFlow (flow-based low-light image enhancement). Conclusion The proposed method has advantages in both restoration quality and performance, and outperforms representative methods on common tasks such as image deraining, low-light image enhancement, and underwater image restoration.

Objective Convolutional neural networks (CNNs) and self-attention (SA) have achieved great success in multimedia applications. However, owing to the intrinsic characteristics of local connectivity and translation equivariance, CNNs have at least two shortcomings: 1) a limited receptive field and 2) static sliding-window weights at inference, which cannot cope with content diversity. The former prevents the network from capturing long-range pixel dependencies, while the latter sacrifices adaptability to the input content. As a result, CNNs fall far short of modeling the global rain distribution and generate results with obvious rain residue. Meanwhile, because SA is computed globally, its computational complexity grows quadratically with spatial resolution, making it infeasible to apply to high-resolution images. In view of the advantages and disadvantages of these two architectures, this study proposes an association learning method that exploits the advantages of both and suppresses their respective shortcomings to achieve high-quality and efficient image restoration. Method This study combines the advantages of the CNN and SA architectures, in particular by fully utilizing CNNs' local perception and translation invariance, as well as SA's global aggregation ability, in specific local-context and global-structure representations. We take inspiration from the observation that the rain distribution, beyond being a prediction target, reflects the location and degree of degradation. Therefore, we propose to refine background textures with the predicted degradation prior in an association learning manner. We accomplish image deraining by associating rain streak removal and background recovery, with an image deraining network and a background recovery network specifically designed for these two subtasks. The key part of association learning is a novel multi-input attention module (MAM). It generates the degradatio
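The gains reported above are PSNR differences in dB. As a rough illustration only, the sketch below computes PSNR from mean squared error using the standard definition and converts a dB gain into the corresponding MSE reduction. It assumes 8-bit pixel values (peak 255) and flat pixel lists; the function names and toy pixel values are hypothetical, and the evaluation protocol actually used in the paper (colour space, cropping) is not stated in this record.

```python
import math


def mse(reference, restored):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    total = sum((r - x) ** 2 for r, x in zip(reference, restored))
    return total / len(reference)


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE); infinite for identical inputs."""
    err = mse(reference, restored)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db (same peak value)."""
    return 10.0 ** (gain_db / 10.0)


# Toy 4-pixel example (hypothetical values, not taken from the paper).
reference = [52, 55, 61, 59]
restored = [50, 56, 60, 62]
print(psnr(reference, restored))

# MSE reduction factors implied by the three reported gains.
for gain in (0.9, 4.15, 1.09):
    print(gain, mse_ratio_for_gain(gain))
```

Under this definition, a 0.9 dB gain corresponds to an MSE about 1.23 times smaller, provided both methods are scored against the same ground truth and peak value.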

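Keywords: image restoration; association learning; self-attention (SA); image deraining; low-light image enhancement; underwater image restoration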

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]