Scene Text Detection Based on the Connection of the Characters

Authors: WANG Liangjun; JI Yuhang; GU Weijie (School of Computer and Communication Engineering, Jiangsu University, Zhenjiang 212013)

Affiliation: [1] School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013

Source: Computer & Digital Engineering (《计算机与数字工程》), 2024, No. 7, pp. 2108-2114 (7 pages)

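Funding: Supported by the National Natural Science Foundation of China (No. 61601202), the Natural Science Foundation of Jiangsu Province (No. BK20140571), and the Jiangsu University Research Start-up Fund for Senior Professional Talent (No. 14JDG038).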

Abstract: In recent years, research on scene text detection has broadened considerably. Thanks to advances in deep convolutional networks and image segmentation, scene text detectors can generate varied text boxes for curved text of arbitrary shape. However, text in scene images is sometimes very small or has an extreme aspect ratio. With deep convolution and a limited receptive field, a network easily loses the feature information of small text and fails to capture the complete features of long text. To address these two difficulties, this paper designs a scene text detector based on character connection. It uses an improved AFF module to fuse local and global features, making the network more sensitive to small text targets and avoiding missed detections of small text. The network outputs character region scores and character gap scores, and text lines are formed by linking characters according to their connection attributes, so the network can detect text of arbitrary length despite its limited receptive field. Because general text detection datasets lack character-level annotations, the paper uses a weakly supervised learning strategy to generate character-level pseudo-labels, and builds a character-level synthetic dataset to compensate for the shortcomings of weakly supervised learning, so that the network can better learn the features of scene text. Experimental results show that the method achieves excellent performance on the general datasets ICDAR2015 and MSRA-TD500.

Keywords: scene text; attention feature fusion; weakly supervised learning; character connection

Classification: TP391.41 [Automation and Computer Technology - Computer Application Technology]
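
Illustrative sketch: The abstract says the network outputs character region scores and character gap scores, and that characters are linked into text lines. It does not give the post-processing details, so the following is only a minimal sketch of one common way such linking can be done. The thresholds, the 4-connected flood fill, and the function name `link_characters` are all assumptions for illustration, not the authors' method.

```python
from collections import deque


def link_characters(region, gap, region_thr=0.5, gap_thr=0.5):
    """Group score-map pixels into text-line boxes (hypothetical helper).

    A pixel counts as text if its character-region score OR its
    character-gap score reaches the threshold (both thresholds are
    assumed values). 4-connected text pixels are merged into one line.
    Returns a list of (x_min, y_min, x_max, y_max) boxes.
    """
    h, w = len(region), len(region[0])
    text = [[region[y][x] >= region_thr or gap[y][x] >= gap_thr
             for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not text[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected component.
            seen[y][x] = True
            queue = deque([(y, x)])
            x0, y0, x1, y1 = x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and text[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


# Toy 1x6 score maps: two characters joined by a high gap score,
# plus one isolated character further right.
region = [[0.9, 0.1, 0.9, 0.0, 0.0, 0.8]]
gap = [[0.0, 0.8, 0.0, 0.0, 0.0, 0.0]]
print(link_characters(region, gap))
```

In this toy input the high gap score at column 1 bridges the characters at columns 0 and 2 into one box, while the character at column 5 stays separate. Because each link depends only on neighbouring pixels, a long line can be assembled from local evidence, which is consistent with the abstract's point about detecting long text under a limited receptive field.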
