Image Harmonization Guided by Semantic Information

Authors: YANG Zi-yuan, LI Peng-cheng, LIU Fang-cen, GAO Chen-qiang [1,2]

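Affiliations: [1] School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; [2] Chongqing Key Laboratory of Signal and Information Processing, Chongqing 400065, China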

Source: Acta Electronica Sinica, 2023, No. 7, pp. 1826-1834 (9 pages)

Funding: National Natural Science Foundation of China (No. 62176035).

Abstract: Image harmonization occupies an important position in image processing. It aims to adjust the appearance of the foreground (e.g., illumination, color, and texture) so that it is visually consistent with the background. However, existing deep learning-based methods usually use the feature distribution of the overall image background as the cue for adjusting the foreground and pay little attention to the critical role that semantic information plays in this adjustment. As a result, local regions of the foreground can still look visually different from the background. To address this, building on the multi-resolution selective fusion module (MRSFM) and the lightweight convolutional block attention module (CBAM), this paper designs a multi-resolution selective fusion module based on a dual attention mechanism (MRSF-DAM). The module makes the final output feature map rich in semantic information, which guides the network to better understand the correlation between the image foreground and its surrounding scene, enables the network to draw more fully from the background the information needed to harmonize the foreground, and ultimately reduces the visual discrepancy between the foreground and background regions of an image. In addition, this paper designs a new network architecture that selectively fuses shallow and deep feature information. The output feature maps of the first six network layers of the decoder and of MRSF-DAM are fused and enhanced at multiple scales, and the resulting enhanced feature maps are fed into the final layer of the decoder. This alleviates the problem of skip connections introducing features unrelated to the foreground content, reduces the loss of spatial feature information caused by the repeated downsampling in the decoder, and further improves the realism of the generated harmonized images. Extensive experiments on the widely used iHarmony4 benchmark dataset verify the effectiveness of the proposed method. Compared with the latest method, SCS-Co (Self-Consistent Style Contrastive learning for image harmonization), the proposed method improves the mean squared error (MSE), foreground mean squared error (fMSE), and peak signal-to-noise ratio (PSNR) over the whole dataset by 4.28, 61.97, and 1 dB, respectively.
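The three metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used definitions (MSE averaged over all pixels, fMSE averaged over foreground-mask pixels only, PSNR computed from MSE with a peak value of 255); the abstract does not spell out the exact evaluation protocol, so these definitions, the function names, and the toy pixel values are illustrative assumptions rather than the authors' code. Images are represented as flat lists of pixel intensities to keep the example dependency-free.

```python
import math


def mse(pred, target):
    """Mean squared error over all pixel values (flat sequences)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def fmse(pred, target, mask):
    """MSE restricted to foreground pixels, i.e. where mask is nonzero."""
    if not (len(pred) == len(target) == len(mask)):
        raise ValueError("pred, target and mask must have the same length")
    fg = [(p, t) for p, t, m in zip(pred, target, mask) if m]
    if not fg:
        raise ValueError("mask selects no foreground pixels")
    return sum((p - t) ** 2 for p, t in fg) / len(fg)


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB (assumed peak value: 255)."""
    err = mse(pred, target)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy example (made-up values): only the two foreground pixels differ.
harmonized = [10, 20, 30, 40]
ground_truth = [10, 20, 34, 44]
foreground_mask = [0, 0, 1, 1]

print(mse(harmonized, ground_truth))                     # 8.0
print(fmse(harmonized, ground_truth, foreground_mask))   # 16.0
print(round(psnr(harmonized, ground_truth), 2))
```

Because the error in this toy case is confined to the foreground, fMSE is larger than MSE: averaging over the whole image dilutes a foreground-only error, which is why harmonization results are often reported with both numbers.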

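Keywords: image harmonization; image processing; semantic information; local background information; multi-resolution selective fusion; spatial feature information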

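Classification: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Electronics and Telecommunications: Information and Communication Engineering]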
