Authors: ZHUO Tiantian (卓天天); SANG Qingbing (桑庆兵)
Affiliation: [1] School of Artificial Intelligence and Computer, Jiangnan University, Wuxi, Jiangsu 214122, China
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2022, No. 4, pp. 888-897 (10 pages)
Funding: Natural Science Foundation of Jiangsu Province (BK20171142).
Abstract: One approach to offline handwritten text recognition segments an image into individual characters, recognizes each one, and joins the results into a string. Because handwritten characters often touch, however, reliable segmentation is hard to achieve. The convolutional recurrent neural network (CRNN) addresses the problem that labels are hard to align when a whole text image is used as input, but because handwriting styles differ greatly between writers, the features it extracts are not expressive enough. To address this, an enhanced convolutional block attention module and a composite convolution are proposed and added to the mainstream CRNN+CTC framework for offline text recognition. The enhanced convolutional block attention module increases the contribution weight of the input feature map and applies channel attention and spatial attention in parallel. This enriches the semantic information of the refined feature map while keeping the channel attention module from interfering with the weights of the spatial attention module, so the network focuses on useful features in the image rather than on useless dragged pen strokes. The composite convolution, embedded in the deep layers of the network, convolves with kernels of several sizes, which fuses features at different scales and improves the network's generalization. The CRNN+CTC framework built on these two components reaches an accuracy of 85.7748% and a character error rate of 8.6% on the IAM dataset, which contains semantic information, and an accuracy of 92.8728% and a character error rate of 3.9% on the RIMES dataset, improving on current mainstream offline text recognition algorithms.
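The abstract reports two metrics, accuracy and character error rate, without defining them in this record. The sketch below shows how such metrics are commonly computed, assuming that accuracy means exact word match and that character error rate is the total Levenshtein edit distance divided by the total number of reference characters. The function names and sample words are hypothetical and are not taken from the paper, whose exact evaluation protocol may differ.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed definition: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def word_accuracy(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed definition: fraction of predictions that match exactly."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


if __name__ == "__main__":
    # Hypothetical ground truth and predictions, for illustration only.
    refs = ["hello", "world"]
    hyps = ["hallo", "world"]
    print(f"CER: {character_error_rate(refs, hyps):.3f}")
    print(f"Accuracy: {word_accuracy(refs, hyps):.3f}")
```

Under these definitions a single wrong character makes the whole word count as an error for accuracy but adds only one edit to the character error rate, which is why a recognizer can have a low character error rate alongside a noticeably lower word accuracy.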
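Keywords: offline English handwritten word recognition; enhanced convolutional block attention module; composite convolution; convolutional recurrent neural network (CRNN)
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]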