Authors: ZHOU Chonghao (周冲浩), GU Yongxiang (顾勇翔), PENG Cheng (彭程) [1,2]
Affiliations: [1] Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, Sichuan 610041, China; [2] University of Chinese Academy of Sciences, Beijing 100049, China
Source: Journal of Computer Applications (《计算机应用》), 2022, Issue S02, pp. 31-35 (5 pages)
Funding: Sichuan Science and Technology Program (2020YFG0009).
Abstract: Text detection in natural scenes suffers from rough segmentation boundaries and missed detections of multi-scale text instances. To address these problems, a multi-scale feature fusion method is proposed. First, Densely connected Atrous Spatial Pyramid Pooling (DenseASPP) and the Convolutional Block Attention Module (CBAM) are tightly integrated with the Progressive Scale Expansion Network (PSENet). The former serves as a scale-aware module that extracts rich multi-scale information and perceives text of different sizes; the latter serves as an attention module that highlights the key features within the multi-scale information and improves boundary localization. Then, dilated convolution is added to the backbone network to enlarge the receptive field. Finally, a progressive expansion algorithm is used in the post-processing stage to optimize text line synthesis. Experimental results on the ICDAR2015 and ICDAR2017-MLT datasets show that the F-score improves by 2.47% and 6.57%, respectively, compared with PSENet. Visualization results show that the proposed method segments text boundaries better and detects text that PSENet misses.
Keywords: text detection; dilated convolution; attention mechanism; feature pyramid; deep learning
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
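
The abstract mentions a progressive expansion algorithm in the post-processing stage but gives no details. The sketch below illustrates the general idea as commonly described for PSENet-style methods: connected components of the smallest predicted kernel seed the text instances, and each larger kernel is then claimed pixel by pixel through breadth-first expansion, so adjacent instances stay separate. It is not the authors' implementation; the function names `label_components` and `progressive_scale_expansion`, the 4-connectivity, and the nested-list mask format are all assumptions made for illustration.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; 8-connectivity is also plausible).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected foreground regions of a binary grid as 1, 2, ..."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def progressive_scale_expansion(kernels):
    """Expand instance labels from the smallest kernel to the largest.

    `kernels` is a list of binary grids ordered from smallest to largest.
    A pixel reached by two instances goes to whichever reaches it first.
    """
    labels, count = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Start the breadth-first search from every pixel labelled so far.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = cy + dy, cx + dx
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[cy][cx]
                    queue.append((ny, nx))
    return labels, count


# Toy example: two seeds in the small kernel, one merged blob in the large one.
small = [[1, 0, 0, 0, 0, 0, 1]]
large = [[1, 1, 1, 1, 1, 1, 1]]
labels, count = progressive_scale_expansion([small, large])
print(count, labels)  # 2 [[1, 1, 1, 1, 2, 2, 2]]
```

In the toy example the largest kernel is a single connected blob, yet the output keeps two instances because each seed only claims the pixels it reaches first.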