Scene Text Detection Based on Triple Segmentation (基于文本三区域分割的场景文本检测方法)

Cited by: 9

Authors: LI Huang (李煌); WANG Xiao-li (王晓莉); XIANG Xin-guang (项欣光)

Affiliation: [1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Source: Computer Science (《计算机科学》), 2020, No. 11, pp. 142-147 (6 pages)

Abstract: With the development of convolutional neural networks, scene text detection has advanced rapidly. However, it still faces several problems. On the one hand, many detection methods use rectangular boxes as detection boxes, which locate irregular text in images poorly. On the other hand, the boxes produced by some methods fail to separate adjacent text instances, causing false detections of neighboring text. To address these two problems, this paper proposes a scene text detection method based on triple segmentation (TS) of text. Text instances in the image are mapped to the whole region (score area), the kernel region (kernel area), and the border region (threshold area) to obtain a segmentation map for each of the three regions. The whole-region (score) map and the border (threshold) map are then used to guide the generation of the kernel map. Although the kernel region carries information such as the location and size of the text in the image, it lacks boundary information. To obtain more accurate detection results, the method uses the border region of the text to supervise the learning of the kernel region. Finally, instance segmentation is performed on the kernel map to produce a bounding polygon that fits each text kernel, and the polygon is expanded by a certain ratio to obtain the detection result. Experiments show that the method reaches a precision of 83% on the ICDAR2015 dataset and improves the F-measure by more than 1% over existing detection algorithms. It also performs well on curved text.
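The final step described in the abstract (segment kernel instances, fit a bounding polygon to each, expand it by a ratio) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract gives neither the polygon-fitting procedure nor the expansion rule, so the sketch assumes a binary kernel map, uses 4-connected component labelling to separate instances, substitutes an axis-aligned box for the tighter polygon, and expands by scaling about the centroid. All function names and the value `ratio=1.5` are hypothetical.

```python
from collections import deque


def label_kernels(kernel_map):
    """Group foreground pixels of a binary grid into 4-connected instances."""
    h, w = len(kernel_map), len(kernel_map[0])
    seen = [[False] * w for _ in range(h)]
    instances = []
    for y in range(h):
        for x in range(w):
            if not kernel_map[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited kernel pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and kernel_map[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            instances.append(pixels)
    return instances


def bounding_polygon(pixels):
    """Axis-aligned box as (x, y) corners; an assumed stand-in for a tighter polygon."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    x0, x1 = min(xs), max(xs) + 1
    y0, y1 = min(ys), max(ys) + 1
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


def expand_polygon(polygon, ratio):
    """Scale a polygon about the mean of its vertices (assumed expansion rule)."""
    cx = sum(x for x, _ in polygon) / len(polygon)
    cy = sum(y for _, y in polygon) / len(polygon)
    return [(cx + (x - cx) * ratio, cy + (y - cy) * ratio) for x, y in polygon]


def detect(kernel_map, ratio=1.5):
    """Return one expanded polygon per kernel instance."""
    return [expand_polygon(bounding_polygon(p), ratio) for p in label_kernels(kernel_map)]


# Two text kernels separated by a two-pixel gap remain separate instances.
kernel_map = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
boxes = detect(kernel_map)
print(len(boxes))
for box in boxes:
    print(box)
```

Because the kernels are shrunken versions of the text regions, adjacent text instances that would touch in the whole-region map stay disconnected here, and the expansion step restores each detection to roughly the full text extent.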

Keywords: scene text detection; neural network; instance segmentation; deep learning; computer vision

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
