Text Super-Resolution Method with Attentional Mechanism and Sequential Units

Original title: 融合注意力与序列单元的文本超分辨率

Authors: WEI Haodong (韦豪东), YI Yaohua (易尧华), YU Changhui (余长慧) [1]; LIN Liyu (林立宇) [2]

Affiliations: [1] School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, Hubei, China; [2] State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, Hubei, China

Source: Geomatics and Information Science of Wuhan University (《武汉大学学报（信息科学版）》), 2024, No. 7, pp. 1120-1129 (10 pages)

Funding: National Key Research and Development Program of China (2021YFB2206200).

Abstract: Objectives: Text in street view images is a key clue for perceiving and understanding a scene. Low-resolution street view images lack detail in their text regions, which lowers text recognition accuracy. Super-resolution can be applied as a pre-processing step to reconstruct the edge and texture details of the text region. To improve text recognition accuracy, we propose a street view image text super-resolution network that combines an attention mechanism with sequential units. Methods: A hybrid residual attention structure extracts the spatial information and channel information of the image text region and fuses the resulting features, learning a multi-level feature representation. A sequential unit extracts sequential prior information between texts in the image through bidirectional gated recurrent units. With gradient prior knowledge as the constraint, a gradient prior loss is designed to sharpen character boundaries when the text region is reconstructed. Results: To verify the effectiveness of the proposed method, we carry out comparative experiments on real scene text images from TextZoom and on synthetic text images. The results show that, compared with the baseline and state-of-the-art general super-resolution algorithms, our model reconstructs visually sharper text edges and clearer texture details, and achieves higher recognition accuracy. Conclusions: Our method makes better use of the prior knowledge of text regions in images, which helps reconstruct text details and improves the accuracy of the text recognition task.
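The gradient prior constraint mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so the version below is an assumption: it compares the gradient-magnitude maps of a super-resolved patch and its high-resolution reference with a mean absolute difference. The function names `gradient_magnitude` and `gradient_prior_loss`, the forward-difference operator, and the toy patches are all hypothetical and are not taken from the paper.

```python
import math

def gradient_magnitude(img):
    """Per-pixel gradient magnitude using forward differences.

    `img` is a 2D list of grayscale values. Differences that would
    reach past the right or bottom border are treated as zero.
    """
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            grad[y][x] = math.hypot(dx, dy)
    return grad

def gradient_prior_loss(sr, hr):
    """Mean absolute difference between the gradient maps of sr and hr.

    Assumed form of the loss (not confirmed by the source): a blurred
    edge in `sr` has a weaker, wider gradient than the sharp edge in
    `hr`, so the loss grows as character boundaries become softer.
    """
    g_sr, g_hr = gradient_magnitude(sr), gradient_magnitude(hr)
    h, w = len(hr), len(hr[0])
    total = sum(abs(g_sr[y][x] - g_hr[y][x])
                for y in range(h) for x in range(w))
    return total / (h * w)

# Toy patches: a sharp vertical stroke edge and a blurred version of it.
hr_patch = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
blurry_patch = [[0.0, 0.25, 0.75, 1.0] for _ in range(3)]

print(gradient_prior_loss(hr_patch, hr_patch))      # identical inputs give 0.0
print(gradient_prior_loss(blurry_patch, hr_patch))  # blurred edge gives a positive loss
```

In a training setup this term would typically be added to a pixel-wise reconstruction loss with a weighting factor; the weighting used by the authors is not stated in the abstract.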

Keywords: street view images; super-resolution; attention mechanism; sequential information; gradient prior loss

Classification: P237 (Astronomy and Earth Sciences: Photogrammetry and Remote Sensing)
