Authors: LIN Hong (林泓) [1], LU Yao-yao (卢瑶瑶) (College of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China)
Affiliation: [1] College of Computer Science and Technology, Wuhan University of Technology
Source: Journal of Zhejiang University: Engineering Science (《浙江大学学报(工学版)》), 2019, No. 8, pp. 1506-1516 (11 pages)
Abstract: Text detection accuracy is hard to improve when the information in the intermediate feature layers of a convolutional neural network is underused and when training does not distinguish between text scales or between hard and easy examples. To address this, a scale-aware text detection method for natural scene images is proposed, based on multi-channel refined feature fusion and focused on hard examples. Multi-channel refined fusion layers are built into the convolutional neural network to extract high-resolution feature maps. Text instances are divided into three scale ranges according to the length of the longer side of their annotated rectangular boxes and are assigned to different proposal networks, each of which extracts the corresponding proposals. A focal loss function is designed to concentrate learning on hard examples, which improves the expressive ability of the model and yields the target text bounding boxes. Experiments show that the proposed multi-channel refined feature extraction achieves high text recall on the COCO-Text dataset, and that the scale-aware, hard-example-focused detection method reaches detection accuracies of 0.89 and 0.83 on the ICDAR2013 and ICDAR2015 standard datasets, respectively. Compared with methods such as CTPN and RRPN, the proposed method is more robust on multi-scale, multi-orientation natural scene images.
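Keywords: deep learning; natural scene; text detection; feature fusion; hard examples; focal loss
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]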
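Two steps named in the abstract, scale partitioning by the longer box side and a focal loss that emphasises hard examples, can be illustrated with a minimal sketch. The abstract gives neither the scale thresholds nor the exact loss, so everything below is an assumption: `focal_loss` uses the standard binary focal loss with its commonly used defaults (`gamma=2.0`, `alpha=0.25`), and `scale_bucket` uses hypothetical cut-offs of 32 and 96 pixels. It is not the authors' implementation.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for a single prediction (assumed form).

    p: predicted probability of the text class, strictly between 0 and 1
    y: ground-truth label, 1 for text and 0 for background
    """
    # Probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def scale_bucket(width, height, small_max=32, medium_max=96):
    """Assign a text box to one of three scale ranges by its longer side.

    small_max and medium_max are hypothetical thresholds in pixels.
    """
    longer = max(width, height)
    if longer <= small_max:
        return "small"
    if longer <= medium_max:
        return "medium"
    return "large"


if __name__ == "__main__":
    # A confidently correct prediction contributes far less than a wrong one.
    print(focal_loss(0.9, 1), focal_loss(0.1, 1))
    print(scale_bucket(10, 40), scale_bucket(100, 20))
```

With `gamma=0` and `alpha=0.5` the function reduces to half the ordinary cross-entropy, which makes the down-weighting effect of `gamma` easy to compare against a plain baseline.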