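Authors: LIAN Zhe (连哲); YIN Yanjun (殷雁君); MI Zeng (米增); ZHI Min (智敏); XU Qiaozhi (徐巧枝)
Affiliation: [1] School of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2025, No. 5, pp. 250-260 (11 pages)
Funding: Natural Science Foundation of Inner Mongolia Autonomous Region (2021LHMS06009, 2021MS06031); Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (NJZZ21004); Fundamental Research Funds of Inner Mongolia Normal University (2022JBZHO13).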
Abstract: Scene text detection is a fundamental research topic in image processing with wide practical value. DBNet, a representative algorithm in this field, reconstructs text instances with an overly simple post-processing step; it tends to produce false detections on text whose aspect ratio varies significantly and to miss small text. To address these problems, this paper proposes AIRPNet, an asymmetric iterative refinement prediction network for scene text detection. The model builds on the ResNet50 feature extraction network, replaces regular convolutions with deformable convolutions to adapt to irregular text features, and adjusts the number of stacked blocks so that each layer carries more appropriate features. It adopts the recursive idea of RFP to fuse multi-layer features more fully, uses a purpose-designed asymmetric iterative refinement prediction module to build more accurate probability maps, and reconstructs text instance boundaries through differentiable binarization post-processing. Several structural variants of the asymmetric iterative refinement prediction module are designed and compared. To evaluate its effectiveness, the model is compared with state-of-the-art mainstream models on three datasets. It achieves F-measures of 88.7%, 88.4%, and 84.9% on ICDAR2015, MSRA-TD500, and CTW1500, respectively, matching or approaching SOTA performance. The experimental results show that the proposed model captures more accurate probability maps and therefore constructs more complete text bounding boxes.
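The differentiable binarization step named in the abstract comes from DBNet. The sketch below assumes the commonly cited DBNet formulation, B = 1 / (1 + exp(-k (P - T))), with amplification factor k = 50. The abstract itself gives neither the formula nor the value of k, so both are assumptions here, and the function name and the toy 2x2 maps are hypothetical. It is meant only to illustrate how a probability map P and a threshold map T yield a soft binary map, not to reproduce AIRPNet's implementation.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Soft binarization B = 1 / (1 + exp(-k * (P - T))).

    prob_map and thresh_map are equally sized 2-D lists with values
    assumed to lie in [0, 1]; k = 50 is an assumed default.
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Hypothetical toy inputs: one confident text pixel, one background pixel,
# and two pixels at or near the threshold.
prob = [[0.9, 0.2], [0.5, 0.31]]
thresh = [[0.3, 0.3], [0.5, 0.3]]
binary = differentiable_binarization(prob, thresh)
```

Because the function is a steep sigmoid rather than a hard step, pixels far from the threshold saturate toward 0 or 1 while pixels near it keep a usable gradient, which is what allows the threshold map to be learned jointly with the probability map.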
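Keywords: text detection; recursive pyramid; asymmetric convolution; iterative refinement prediction; differentiable binarization
Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]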