Authors: 马静 (MA Jing) [1]; 薛浩 (XUE Hao) [1]; 郭小宇 (GUO Xiaoyu) (College of Economics and Management, Nanjing University of Aeronautics and)
Affiliation: [1] College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Source: 《计算机工程与应用》 (Computer Engineering and Applications), 2023, No. 21, pp. 112-122 (11 pages)
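Funding: National Natural Science Foundation of China, General Program (72174086); Fundamental Research Funds for the Central Universities, Special Project for Forward-Looking Development Strategy Research (NW2020001).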
Abstract: Text detection is the prerequisite and foundation of text recognition. In complex natural scenes, perspective, occlusion, and deformation make image quality hard to guarantee. The text in such images also takes many forms and is often irregular in shape, and complex backgrounds add further interference, so text detection is difficult and its accuracy is low. For scenes with irregularly shaped text, this paper proposes a text rail model (TextRail), which represents a text region by fiducial points on its upper and lower boundaries and thereby detects text of arbitrary shape efficiently. First, TextRail uses a fully convolutional network (FCN) and a feature pyramid network (FPN) to extract features from the text image. Second, the features are fed to a detection head network that predicts the fiducial points on the upper and lower boundaries of the text region. The predictions are then merged by locality-aware non-maximum suppression (LNMS) to obtain the final upper and lower fiducial points. Finally, thin plate spline (TPS) interpolation is applied to rectify irregular text automatically on the basis of these fiducial points. Extensive experiments show that TextRail outperforms other text detection models in F1 score. The model can also accurately represent the orientation, bending, and deformation of text, which effectively improves the accuracy and robustness of irregular text detection.
Keywords: complex natural scenes; irregular text detection; text rectification; fiducial points; TextRail model
Classification: TP751.2 [Automation and Computer Technology: Detection Technology and Automation Devices]
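
Illustrative note: the abstract names locality-aware non-maximum suppression (LNMS) as the step that merges predictions but gives no detail. The following is a minimal sketch of the general idea only, not the authors' implementation. It assumes axis-aligned boxes `(x1, y1, x2, y2, score)` rather than TextRail's fiducial points, assumes the input arrives in row-major scan order, and uses hypothetical function names and a hypothetical default threshold of 0.5.

```python
from typing import List, Tuple

# Assumed representation (not from the paper): x1, y1, x2, y2, score.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def weighted_merge(a: Box, b: Box) -> Box:
    """Average the coordinates weighted by score; the scores are summed."""
    total = a[4] + b[4]
    if total == 0:
        return a
    coords = tuple((a[i] * a[4] + b[i] * b[4]) / total for i in range(4))
    return (coords[0], coords[1], coords[2], coords[3], total)


def standard_nms(boxes: List[Box], thresh: float) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box, drop heavy overlaps."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= thresh for k in kept):
            kept.append(box)
    return kept


def locality_aware_nms(boxes: List[Box], thresh: float = 0.5) -> List[Box]:
    """Merge consecutive overlapping boxes first, then run standard NMS."""
    merged: List[Box] = []
    last = None
    for box in boxes:  # assumed to be in row-major scan order
        if last is not None and iou(last, box) > thresh:
            last = weighted_merge(last, box)
        else:
            if last is not None:
                merged.append(last)
            last = box
    if last is not None:
        merged.append(last)
    return standard_nms(merged, thresh)


if __name__ == "__main__":
    candidates = [
        (0.0, 0.0, 10.0, 10.0, 0.5),
        (1.0, 0.0, 11.0, 10.0, 0.5),
        (100.0, 100.0, 110.0, 110.0, 0.9),
    ]
    for kept_box in locality_aware_nms(candidates):
        print(kept_box)
```

The merge pass exploits the fact that neighbouring pixels tend to predict nearly identical geometry, so most candidates collapse in a single linear scan before the quadratic NMS step runs. Applying the same weighted averaging to lists of boundary points instead of box corners would be one plausible adaptation, though the abstract does not say how TextRail does it.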