Scene Text Recognition with Eliminating Background Noise and Enhancing Characters Shape Feature (消除背景噪声增强字符形状特征的场景文字识别)

Authors: Tang Shancheng [1]; Liang Shaojun; Lu Biao; Zhang Ying; Jin Zicheng; Lu Jianhui (School of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710600)

Affiliation: [1] School of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710600

Source: Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》), 2024, No. 6, pp. 875-883 (9 pages)

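Funding: National Key Research and Development Program of China (2018YFC0808300); Shaanxi Province Science and Technology Plan, Key Industry Innovation Chain (Cluster) Project (2020ZDLGY15-07); Xi'an Science and Technology Plan, Science and Technology Innovation Guidance Project (201805036YD14CG20(4)).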

Abstract: Existing methods do not effectively eliminate interference from background noise or from noise in the characters themselves. To address this problem, a text recognition model with three modules is proposed that eliminates background noise and enhances character shape features (EBEC). The EBEC network, enhanced by a spatial attention mechanism, attends only to character-region features; this eliminates background noise and forces the network to learn only character shape features, thereby enhancing them. The feature extraction module uses EfficientNet-B3 as the backbone network to extract feature maps. The primitive representation learning module learns from the feature maps to obtain visual text representations, which are then decoded to produce the recognition result. Experimental results show that, compared with the classical model, the proposed model improves recognition accuracy by 9.76 percentage points on the synthetic scene dataset and by an average of 2.91 percentage points on the public datasets IIIT5K, ICDAR-2003, ICDAR-2015, and CUTE80. The model effectively eliminates background noise and character noise and improves recognition performance.
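The abstract does not give the internals of the spatial attention step, so the following is only a minimal sketch of the general idea, assuming a CBAM-style mask: each spatial position gets a weight in (0, 1) computed from the channel-wise mean and maximum, and the feature map is multiplied by that weight so that low-response (background-like) positions are suppressed. The function name `spatial_attention` and the parameters `w_avg`, `w_max`, and `bias` are hypothetical; in a trained network they would be replaced by a learned convolution, and this is not the authors' EBEC implementation.

```python
import math


def spatial_attention(features, w_avg=1.0, w_max=1.0, bias=0.0):
    """Reweight a C x H x W feature map (nested lists) with a spatial mask.

    Hypothetical CBAM-style sketch: the mask at each pixel is the sigmoid of
    a weighted sum of the channel-wise mean and max at that pixel. The
    weights stand in for a learned convolution and are illustrative only.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    mask = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            column = [features[c][y][x] for c in range(channels)]
            avg, peak = sum(column) / channels, max(column)
            logit = w_avg * avg + w_max * peak + bias
            # Sigmoid written via tanh so large |logit| cannot overflow.
            mask[y][x] = 0.5 * (1.0 + math.tanh(logit / 2.0))
    # Apply the same spatial mask to every channel.
    attended = [[[features[c][y][x] * mask[y][x] for x in range(width)]
                 for y in range(height)] for c in range(channels)]
    return attended, mask


# Toy 2-channel, 2 x 2 map: strong responses on the diagonal ("character"),
# negative responses elsewhere ("background").
features = [
    [[4.0, -4.0], [-4.0, 4.0]],
    [[6.0, -6.0], [-6.0, 2.0]],
]
attended, mask = spatial_attention(features)
```

In this toy input the mask is close to 1 at the high-response positions and close to 0 at the others, so the attended map keeps the former nearly unchanged and damps the latter. How the mask is supervised to coincide with character regions is not described in the abstract and is left out here.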

Keywords: scene text recognition; spatial attention mechanism; background noise; character noise

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]
