Authors: CAO Jingang (曹锦纲); ZHANG Zeen (张泽恩) [1,2]; ZHANG Mingquan (张铭泉)
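Affiliations: [1] School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, Hebei, China; [2] Engineering Research Center of Intelligent Computing for Complex Energy Systems, Ministry of Education, North China Electric Power University, Baoding 071003, Hebei, China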
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2024, No. 6, pp. 1503-1517 (15 pages)
Funding: Fundamental Research Funds for the Central Universities (2021MS092).
Abstract: To address the slow inference speed and large parameter counts of existing text spotting methods, this paper proposes a lightweight end-to-end text spotting method that improves the single-point scene text spotting (SPTS) model. First, PP-LCNet is introduced as the backbone network for feature extraction. Second, a three-local channel attention module is placed before the decoder; it uses one-dimensional convolutions at three different scales to strengthen information interaction between channels. Third, a locally enhanced attention module replaces the feedforward network in the original decoder and uses depthwise separable convolution to strengthen the spatial correlation of text features. Fourth, a token selector module is added after each decoder layer to highlight text features through saliency tokens and to reduce the accumulation of irrelevant pixels. Finally, the recognition results are predicted by autoregressive decoding. The method is evaluated on the Total-Text, CTW1500, and ICDAR2015 datasets and compared with six advanced methods (ABCNet, MANGO, ABCNet v2, SPTS, SwinTextSpotter, and TESTR). Compared with SPTS, the proposed method raises inference speed by 19.6, 35.7, and 21.1 frames/s on the three datasets, respectively, and reduces the number of parameters by 70.7%, which demonstrates its effectiveness.
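The abstract describes the three-local channel attention module only at a high level (one-dimensional convolutions at three scales across channels). The following is a minimal pure-Python sketch of that general idea, not the authors' implementation. The function names, the kernel sizes 3, 5, and 7, the uniform kernel weights, the averaging of the three branches, and the sigmoid gating are all assumptions made for illustration; the record does not specify any of them.

```python
import math


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def three_scale_channel_attention(feature_map, kernels):
    """Hypothetical three-scale channel attention.

    feature_map: list of C channels, each a list of H rows of W floats.
    kernels: three odd-length 1-D kernels applied along the channel axis.
    Returns (per-channel weights, reweighted feature map).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Three 1-D convolutions over the channel descriptors, one per scale.
    branches = [conv1d_same(pooled, k) for k in kernels]
    # Assumption: fuse the branches by averaging, then gate with a sigmoid.
    weights = [sigmoid(sum(vals) / len(branches)) for vals in zip(*branches)]
    # Excite: rescale every value in a channel by that channel's weight.
    scaled = [
        [[w * v for v in row] for row in ch]
        for w, ch in zip(weights, feature_map)
    ]
    return weights, scaled


# Illustrative (assumed) uniform kernels at three scales.
KERNELS = [[1.0 / n] * n for n in (3, 5, 7)]

if __name__ == "__main__":
    # Toy input: 4 channels, each 2x2, where channel c is filled with the value c.
    fmap = [[[float(c)] * 2 for _ in range(2)] for c in range(4)]
    weights, scaled = three_scale_channel_attention(fmap, KERNELS)
    print(weights)
```

In a trained network the kernel weights would be learned rather than fixed, and the module would operate on batched tensors; the sketch only shows how small kernels sliding over the channel axis let each channel's weight depend on its neighbours at several ranges.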
Keywords: attention module; autoregressive decoding; lightweight network; single-point localization; text spotting; end-to-end; encoder; decoder
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]