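Affiliations: [1] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049; [2] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190; [3] State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing 100084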
Source: Journal of Image and Graphics (《中国图象图形学报》), 2022, No. 4, pp. 1264-1276 (13 pages)
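Funding: National Key Research and Development Program of China (2019YFB2204104); National Natural Science Foundation of China (61772523); Joint Open Research Fund of the State Key Laboratory of Hydroscience and Engineering, Tsinghua University, and the Ningxia Yinchuan Water Network Digital Water Governance Joint Research Institute (sklhse-2019-Iow04).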
Abstract: Objective Text in images is ubiquitous in daily life. While it conveys information, it also creates a risk of information leakage. Image text removal algorithms address this risk, but existing methods leave text residue and fill the removed regions with visually unsatisfying results. We therefore propose an image text removal model based on the gate recurrent unit (GRU) that removes text from images with high quality and high efficiency. Method A stroke-level binary mask detection module composed of GRUs accurately extracts the stroke-level binary mask of the input image. This mask is fed as auxiliary information into a text removal module based on a generative adversarial network (GAN), which removes the text and fills in the background color. The proposed text loss function and brightness loss function improve removal quality, and inverted residual blocks replace ordinary convolutions to make removal efficient. Results On a real dataset of 1,080 manually processed groups and a synthetic dataset of 1,000 groups produced by a text synthesis method, the proposed method was compared with three other text removal methods. It achieved better performance on image quality metrics such as peak signal-to-noise ratio and structural similarity, as well as in visual quality. Conclusion Compared with the baseline methods, the proposed GRU-based model effectively resolves both incomplete text removal and inconsistency between the removed region and the background, while also reducing the model's parameter count and computation. Overall computation is reduced by 72.0%.
Objective Textual information in digital images is ubiquitous in daily life. While it delivers valuable information, it also risks leaking private information. For example, when photos are taken or data are collected, private information such as phone numbers inevitably appears in some images. Image text removal technology protects privacy by removing sensitive information from images. It can also be applied widely in image and video editing, text translation, and related tasks. Tursun et al. added a binary mask as auxiliary information to make the model focus on the text area, which clearly improved on existing scene text removal methods. However, this binary mask is redundant: it covers a large amount of background between text strokes, so the removed area (indicated by the binary mask) is larger than what actually needs to be removed (i.e., the text strokes). This limitation leaves room for improvement. To address incomplete text removal and poor visual quality after removal in existing methods, we propose a gate recurrent unit (GRU)-based generative adversarial network (GAN) framework that removes text effectively and produces high-quality results. Method Our framework is fully end-to-end. Given the image with text and the binary mask of the corresponding text area as input, our detection module, composed of multiple GRUs, accurately extracts the stroke-level binary mask of the input image. The GAN-based text removal module then combines the input image, the text area mask, and the stroke-level mask to remove the text in the image. We also propose a brightness loss function to further improve visual quality, based on the observation that human eyes are more sensitive to changes in image brightness. Specifically, we convert the output image from the RGB space to the YCrCb color space and minimize the difference in the brightness channel of the output image and gro
Keywords: text removal; gate recurrent unit (GRU); generative adversarial network (GAN); inverted residual block; image inpainting
Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
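
The brightness loss described in the abstract (convert to YCrCb, then compare brightness channels) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the ITU-R BT.601 luma weights for the Y channel, an L1 (mean absolute) difference, and a reference image supplied by the caller; the abstract is truncated before it names the comparison target or the exact norm. The function names `luma` and `brightness_loss` are hypothetical.

```python
import numpy as np

# ITU-R BT.601 luma weights for R, G, B; these define the Y channel of YCrCb.
# Assumption: the paper's exact conversion constants are not given in the abstract.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def luma(rgb):
    """Return the Y (brightness) channel of an H x W x 3 RGB array with values in [0, 1]."""
    return rgb @ LUMA_WEIGHTS


def brightness_loss(output_rgb, reference_rgb):
    """Mean absolute difference between the Y channels of two RGB images.

    Assumption: an L1 difference is used here purely for illustration.
    """
    return float(np.mean(np.abs(luma(output_rgb) - luma(reference_rgb))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((4, 4, 3))
    # Perturb only brightness by scaling all channels equally.
    output = np.clip(reference * 0.9, 0.0, 1.0)
    print(brightness_loss(output, reference))
```

Because the loss looks only at the Y channel, two images that differ in chroma but share the same luma would score zero, so a loss of this kind would normally be combined with other terms, as the abstract indicates with its separate text loss.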