Research on Scene Text Recognition Technology Based on Deep Learning


Authors: CHEN Zhi-yu; SI Zhan-jun [2]; ZHU Xin-yu

Affiliations: [1] College of Light Industry Science and Engineering, Tianjin University of Science and Technology, Tianjin 300457, China; [2] College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, China

Source: Printing and Digital Media Technology Study (《印刷与数字媒体技术研究》), 2024, No. 3, pp. 237-243, 291 (8 pages in total)

Abstract: Scene text recognition (STR) based on deep learning is widely used, but its performance still needs improvement. Existing STR methods recognize small text inaccurately and achieve low accuracy on Chinese and mixed Chinese-English text. To improve detection, the model is extended with an additional 104×104 feature scale, Focal Loss and GIoU Loss are used as loss functions to optimize the detection boxes, and the Convolutional Block Attention Module (CBAM) is embedded in the convolutional layers so that the network attends more closely to the target at specific positions and channels while suppressing complex background information. Together these changes strengthen the model's text detection capability. To improve recognition, the characteristics of Chinese characters are analyzed and the feature extraction network of CRNN is optimized, which raises the accuracy of the original model on Chinese and mixed Chinese-English text. Experimental results show that these improvements to the text detection and recognition models and algorithms significantly increase the accuracy and robustness of scene text recognition.
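As an illustration of the two detection losses named in the abstract, the following is a minimal Python sketch of the standard textbook forms of Focal Loss and GIoU Loss. It is not the authors' implementation: the function names, the `(x1, y1, x2, y2)` box format, and the defaults `alpha=0.25` and `gamma=2.0` are assumptions chosen for illustration, and boxes are assumed to have positive area.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class, strictly in (0, 1).
    y: ground-truth label, 1 or 0.
    alpha, gamma: commonly used defaults, assumed here (not from the paper).
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Area of the smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    # IoU minus the fraction of the enclosing box not covered by the union.
    return inter / union - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """GIoU loss in [0, 2]; 0 means the boxes coincide."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, GIoU stays informative when the predicted and ground-truth boxes do not overlap, because the enclosing-box term still changes as the boxes move closer. The focal term reduces the contribution of well-classified examples so that hard examples weigh more in training.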

Keywords: deep learning; scene text recognition; image processing; object detection; text recognition

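Classification: TP391 [Automation and Computer Technology: Computer Application Technology]; TS801.3 [Automation and Computer Technology: Computer Science and Technology]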

 
