Face photo-sketch synthesis based on multi-scale feature fusion

Cited by: 1


Authors: Changcheng LIANG; Nannan WANG [1]; Mingrui ZHU; Xi YANG; Jie LI; Xinbo GAO

Affiliations: [1] State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China; [2] Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), 2022, No. 2, pp. 334-347 (14 pages)

Abstract: Synthesizing face sketches from real face photos, and the inverse process, has a wide range of applications, such as digital entertainment and assisting criminal investigations. However, because photos and sketches differ significantly in texture, translating between them remains a challenging problem. Recent methods based on generative adversarial networks (GANs) have shown encouraging results on image-to-image translation, particularly photo-to-sketch translation, but most of them produce varying degrees of deformation or blur in key facial components, which reduces the realism of the synthesized images. To address this challenge, we propose a novel face photo-sketch synthesis algorithm based on multi-scale feature fusion that improves the structural integrity and texture fidelity of the synthesized images. First, an encoder extracts multi-scale encoding features from the input image. The bottom-level encoding features are then passed through a dilated convolution module and fed to the decoder. During decoding, the decoding features at each scale are concatenated along the channel dimension with the encoding features of the corresponding scale, yielding multi-scale encoder-decoder fusion features. Finally, at the output of the decoder, the fusion features from the different scales are further fused, and a single convolution layer produces the final synthesis result. By concatenating features from different scales of the encoding and decoding processes along the channel dimension in this way, the method preserves image structure and texture details well and generates realistic face sketch/photo images. We verify the effectiveness of the proposed method on several challenging datasets. Quantitative and qualitative evaluations show that the proposed model outperforms other state-of-the-art methods in generating face sketches (or photos) of high visual quality.
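The data flow described in the abstract can be followed with a small shape-tracing sketch. This is only an illustration, not the authors' implementation: it tracks feature-map shapes (channels, height, width) instead of running real convolutions. The number of encoder stages, the channel widths, the stride-2 downsampling, the 2x upsampling, and the names `Feat`, `conv`, `upsample`, `concat`, and `trace_generator` are all assumptions chosen for the example; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feat:
    """Shape of a feature map: channels, height, width (no pixel data)."""
    c: int
    h: int
    w: int


def conv(x, out_c, stride=1):
    # Shape effect of a padded convolution. A dilated convolution with
    # padding equal to its dilation rate has the same effect at stride 1.
    return Feat(out_c, x.h // stride, x.w // stride)


def upsample(x, scale):
    # Spatial upsampling keeps the channel count.
    return Feat(x.c, x.h * scale, x.w * scale)


def concat(*feats):
    # Channel-wise concatenation: spatial sizes must match, channels add up.
    first = feats[0]
    if any((f.h, f.w) != (first.h, first.w) for f in feats):
        raise ValueError("channel concatenation needs equal spatial sizes")
    return Feat(sum(f.c for f in feats), first.h, first.w)


def trace_generator(image, widths=(64, 128, 256), out_channels=1):
    # 1. Encoder: one feature map per scale (assumed stride-2 stages).
    enc, x = [], image
    for c in widths:
        x = conv(x, c, stride=2)
        enc.append(x)

    # 2. Bottom-level features pass through the dilated convolution module
    #    (assumed shape-preserving) before entering the decoder.
    x = conv(enc[-1], enc[-1].c)

    # 3. Decoder: at each scale, concatenate the decoding features with the
    #    encoding features of the same scale along the channel dimension.
    fused = []
    for skip in reversed(enc[:-1]):
        x = conv(upsample(x, 2), skip.c)
        x = concat(x, skip)
        fused.append(x)

    # 4. Output side: bring every fused map to the input resolution, fuse
    #    them again by concatenation, then apply one convolution layer.
    aligned = [upsample(f, image.h // f.h) for f in fused]
    merged = concat(*aligned)
    return fused, merged, conv(merged, out_channels)


fused, merged, output = trace_generator(Feat(3, 256, 256))
print(fused, merged, output)
```

With these assumed settings, a 3-channel 256x256 photo yields fused features at two scales, which are merged into one full-resolution map before the final convolution produces a single-channel sketch. Replacing the shape helpers with real layers in a deep learning framework would follow the same structure.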

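Keywords: face photo-sketch synthesis; image-to-image translation; generative adversarial networks; multi-scale feature fusion; dilated convolution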

CLC number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
