Scene Text Image Super-Resolution Reconstruction Based on Perceiving Multi-Domain Character Distance

Original title: 多域字符距离感知的场景文本图像超分辨率重建

Cited by: 1

Authors: HUANG Jun-yang (黄俊炀); CHEN Hong-hui (陈宏辉); WANG Jia-bao (王嘉宝); CHEN Ping-ping (陈平平) [1]; LIN Zhi-jian (林志坚) (College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China)

Affiliation: [1] College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China

Source: Acta Electronica Sinica (《电子学报》), 2024, No. 7, pp. 2262-2270 (9 pages)

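Funding: National Natural Science Foundation of China (No.62171135); Fujian Provincial Distinguished Young Scholars Project (No.2022J06010); Key Research Project of the Fujian Provincial Department of Education (No.2023XQ004); Fuzhou Science and Technology Bureau Project (No.2023-P-001).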

Abstract: Scene text image super-resolution (STISR) aims to enhance the resolution and legibility of text in low-resolution images. In spatially deformed or low-resolution text images, text regions lack detail, so semantic cues and visual features are hard to align with character positions, and text recognition performs poorly. To address this problem, this paper proposes a scene text image super-resolution method based on perceiving multi-domain character distance (PMDC), which strengthens visual-semantic features and improves text regions and edge texture details. First, an asymmetric convolution module and a semantic prior module extract the visual and semantic features of the text image. Second, the character distance perception module fuses the visual and semantic features to obtain an enhanced positional encoding that perceives the changes in spacing and the semantic similarity between characters. Finally, the guiding cues and visual features are combined to rearrange the pixels and generate the super-resolution text image. In experiments on the public TextZoom dataset, PMDC improves the peak signal-to-noise ratio (PSNR) by 0.11 dB over the recent TATT text super-resolution network, which sharpens the text area and its edge texture details. It also improves the average recognition accuracy by 1.4% (the Chinese abstract reports 1.5%), which makes the text image more readable.
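The abstract reports its fidelity gain in PSNR. As a rough guide to what a 0.11 dB difference means, the sketch below uses the standard definition PSNR = 10·log10(MAX² / MSE). It assumes 8-bit images (MAX = 255) given as nested lists. The function names `mse`, `psnr`, and `mse_ratio_for_gain` and the sample pixel values are hypothetical; this is not the authors' evaluation code.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized images (nested lists)."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("images must be non-empty and the same size")
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit pixels."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)

def mse_ratio_for_gain(gain_db):
    """Ratio MSE_new / MSE_old implied by a PSNR gain of gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)

# Hypothetical 3x3 grayscale patches: a high-resolution reference and a
# super-resolved estimate that differs by a few intensity levels.
hr = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
sr = [[54, 55, 60], [59, 77, 61], [75, 62, 60]]

print(f"PSNR of the sample patch: {psnr(hr, sr):.2f} dB")
print(f"MSE ratio for a 0.11 dB gain: {mse_ratio_for_gain(0.11):.4f}")
```

Under this definition, a 0.11 dB gain corresponds to a mean squared error roughly 2.5% lower than the baseline's, assuming both methods are scored against the same reference images.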

Keywords: computer vision; scene text image super-resolution; attention mechanism; feature information association

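Classification: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]; TP391.43 [Electronics and Telecommunications: Information and Communication Engineering]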
