Introducing a Complex Scene Text Detection Algorithm Incorporating Multi-Level Features and Channel Attention


Authors: JIA Xiao-yun [1], WENG Jia-shun, LIU Yan-luo (School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an 710021)

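Affiliation: [1] School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an, Shaanxi 710021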

Source: Manufacturing Automation (《制造业自动化》), 2025, Issue 3, pp. 127-133 (7 pages)

Funding: National Natural Science Foundation of China (61971272).

Abstract: To address challenges such as skewed text and inconsistent text sizes encountered in text recognition across diverse environments, this study proposes an efficient text recognition algorithm that combines an attention mechanism with feature integration. First, an attention mechanism is added to the feature extraction stage of a deep convolutional neural network to promote information exchange between feature levels, reducing missed detections caused by the diversity of text positions. Second, dilated convolutions, whose receptive field can be varied, help capture fine detail in text regions and adapt to text at different scales. Finally, a feature pyramid enhancement mechanism efficiently combines features of different sizes, channels, and depths into the final feature used for segmentation. This both improves text detection accuracy and reduces model complexity. On the ICDAR 2015 dataset, the improved algorithm achieves a detection accuracy of 88.1%, an improvement over the currently leading DBNet algorithm. In addition, on the MPSC dataset, which targets manufacturing scenarios, the algorithm achieves a detection accuracy of 90.3%, demonstrating its effectiveness on domain-specific problems.

Keywords: text detection; complex scenes; multi-level features; channel attention

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
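Illustrative sketch: the abstract names two building blocks, channel attention and dilated convolution, without giving their exact form. The pure-Python sketch below shows one common way each could work, and is not the authors' implementation. It assumes a squeeze-and-excitation style of channel attention (the record does not say which variant the paper uses), and the function names, the weight matrices w1 and w2, and the toy inputs are all hypothetical.

```python
import math


def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation style channel attention (assumed variant).

    feature_map: C x H x W nested list of floats.
    w1: (C // r) x C bottleneck weights; w2: C x (C // r) weights.
    Both weight matrices are hypothetical stand-ins for learned parameters.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excite: bottleneck layer with ReLU, then a layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Rescale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


def receptive_field(kernel_size, dilation):
    """Effective span of a single dilated kernel along one axis."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated cross-correlation without padding.

    Kernel taps are spaced `dilation` samples apart, so the same number
    of weights covers a wider stretch of the input.
    """
    span = receptive_field(len(kernel), dilation)
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]
```

With a 3-tap kernel, raising the dilation from 1 to 2 widens the span from 3 to 5 input samples without adding weights, which is the variable receptive field property the abstract refers to. The gates returned by channel_attention lie between 0 and 1, so each channel is scaled down in proportion to how little the gating layers weight it.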

 
