Recognition of Long Printed Tibetan Cursive Script Text Based on Rcnn+Char_SegNet


Authors: CAIRANG Dangzhi; HUANG Heming [1,2,3]; LI Xinyuan; ZHANG Huiyun

Affiliations: [1] School of Computer Science and Technology, Qinghai Normal University, Xining, Qinghai 810008, China; [2] State Key Laboratory of Tibetan Intelligent Information Processing and Application (co-established by Qinghai Province and the Ministry of Science and Technology), Qinghai Normal University, Xining, Qinghai 810008, China; [3] Key Laboratory of Tibetan Information Processing, Ministry of Education, Qinghai Normal University, Xining, Qinghai 810008, China

Source: Journal of Chinese Information Processing, 2023, No. 12, pp. 62-69, 75 (9 pages)

Funding: Qinghai Province Science and Technology Program (2017-GX-146); National Natural Science Foundation of China (62066039, 62166034).

Abstract: Tibetan text recognition has important applications in processing ancient Tibetan documents, Tibetan office automation, and Tibetan-Chinese bilingual education. Cursive Script (Umê), one of the two common Tibetan scripts, exhibits severe stroke adhesion and interleaving, which makes it difficult to recognize. To address this, the paper proposes a method for recognizing long Tibetan Cursive Script text based on Rcnn+Char_SegNet. First, recurrent connections are added to each convolutional layer of the CNN to strengthen its ability to extract features from adhered Cursive Script fragments and to integrate context information. Second, the extracted image text feature sequence is modeled with a BiLSTM. Finally, a character segmentation module is used to strengthen CTC's supervision of the alignment between the image sequence and the labels. On the self-constructed Cursive Script-C517 test dataset, the highest and average accuracies of the model reach 99.80% and 91.43%, which are 1.45 and 48.47 percentage points higher than the baseline, respectively. In addition, training with a character-level dictionary reduces the model's training time by 13.63%. Experiments show that the method effectively resolves the recognition errors caused by severe stroke adhesion and interleaving in Cursive Script, significantly improves recognition accuracy for printed Tibetan Cursive Script, shortens training time, and is robust.

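Keywords: recurrent convolutional neural network; printed Tibetan recognition; image sequence recognition; printed Tibetan Cursive Script recognition; Tibetan character segmentation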

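Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]; H214 [Automation and Computer Technology: Computer Science and Technology]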
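
The abstract relies on CTC to align the image feature sequence with the label sequence. As a rough illustration of what that alignment implies at inference time, the sketch below shows standard greedy (best-path) CTC decoding: take the best class per frame, collapse consecutive repeats, then drop the blank symbol. It is not the authors' code; the function name `ctc_greedy_decode`, the blank index `0`, and the toy scores are all assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (not specified in the abstract)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Greedy CTC decoding over per-frame class scores.

    frame_scores: list of frames, each a list with one score per class.
    Returns the decoded label sequence as a list of class indices.
    """
    # 1. Pick the highest-scoring class in every frame (the "best path").
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Collapse runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two equal labels keeps them distinct.
    return [label for label in collapsed if label != blank]


# Toy input: 6 frames, 3 classes (0 = blank, 1 and 2 = hypothetical characters).
toy_scores = [
    [0.10, 0.80, 0.10],  # class 1
    [0.20, 0.70, 0.10],  # class 1 (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.80, 0.10],  # class 1 again, kept because a blank separates it
    [0.10, 0.10, 0.80],  # class 2
    [0.10, 0.10, 0.80],  # class 2 (repeat, collapsed)
]
decoded = ctc_greedy_decode(toy_scores)
print(decoded)
```

In text with heavy stroke adhesion, frame boundaries between adjacent characters are ambiguous, which is presumably why the paper adds a character segmentation module as extra supervision for this alignment.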

 
