Authors: ZHANG Bo (张博) [1], XU Yanyan (徐彦彦) [1], WANG Zhiheng (王志恒), YAN Yuejing (闫悦菁)
Affiliation: [1] State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (Wuhan University), Wuhan, Hubei 430072, China
Source: Journal of Computer Applications (《计算机应用》), 2023, Issue S02, pp. 9-17 (9 pages)
Funding: National Key Research and Development Program of China (2021YFB2501103); National Natural Science Foundation of China (42271431).
Abstract: To address the detection difficulties caused by the varied scales of text instances, large gaps between characters within a text, and irregular shapes in natural scene text detection, an irregular text detection method for natural scenes was proposed. Sparse Region-based Convolutional Neural Network (Sparse R-CNN) was employed as the basic detection framework. First, Feature Pyramid Network (FPN) was used during the feature extraction stage to fuse features from different stages into a multi-scale feature pyramid, extracting deeper text features at more scales and enabling the network to detect texts of various scales. Then, the Intra-instance Collaborative Learning (Intra-CL) module was introduced to collaboratively sample features from the character regions and gap regions of text instances through a cascaded combination of convolutions with multiple receptive fields. This represented the features of each text instance completely, alleviating the problem of text being detected in broken fragments. Finally, Dynamic Mask Head (DynMH) was introduced. Through interaction with the detection head, it fully learned text features at different levels and performed instance segmentation on the text regions, generating fine text contours and achieving text detection of arbitrary shapes. The model was evaluated on the standard datasets ICDAR2015, TotalText, and CTW1500. Experimental results indicate that the method increases the comprehensive text detection metric, the F-measure, by 2.3 percentage points compared to TextSnake on ICDAR2015, by 2.3 percentage points compared to SegLink++ on TotalText, and by 1.2 percentage points compared to TextField on CTW1500. The visualization results show that the method can accurately locate text regions and better segment text boundaries.
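Note on the metric: the F-measure reported in the abstract is conventionally the harmonic mean of precision and recall. The sketch below shows that standard definition and how a gain in percentage points would be computed. The function names and the precision/recall values are hypothetical, chosen only for illustration; they are not taken from the paper, and the paper's exact evaluation protocol is not given in this record.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing is detected correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


def gain_in_points(f_new: float, f_base: float) -> float:
    """Difference between two F-measures, expressed in percentage points."""
    return (f_new - f_base) * 100


if __name__ == "__main__":
    # Hypothetical precision/recall pairs, NOT results from the paper.
    baseline = f_measure(0.80, 0.90)
    improved = f_measure(0.85, 0.90)
    print(f"baseline F = {baseline:.4f}")
    print(f"improved F = {improved:.4f}")
    print(f"gain = {gain_in_points(improved, baseline):.2f} percentage points")
```

A gain stated in percentage points is the plain difference between the two scores, not a relative improvement over the baseline.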
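Keywords: scene text detection; feature pyramid network; instance segmentation; collaborative learning; deep learning
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]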