Research on Tibetan Ujin Ancient Book Character Recognition Based on Attention Mechanism

Authors: TONG Pan; LONG Bing-xin; YONG Cuo

Affiliations: [1] School of Information Science and Technology, Tibet University, Lhasa 850000, Tibet, China; [2] Key Laboratory of Tibetan Information Technology and Artificial Intelligence of Tibet Autonomous Region, Tibet University, Lhasa 850000, Tibet, China; [3] Engineering Research Center of Tibetan Information Technology of Ministry of Education, Tibet University, Lhasa 850000, Tibet, China

Source: Computer Technology and Development (《计算机技术与发展》), 2023, No. 10, pp. 163-168, 208 (7 pages)

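Funding: Key Special Project of the National Key R&D Program of China (2017YFB1402202); Independent R&D Project of the Science and Technology Innovation Base of the Tibet Autonomous Region (XZ2021HR002G).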

Abstract: Character recognition in Tibetan Ujin ancient books is a difficult problem in the field of ancient book character recognition. To address the touching characters and complex backgrounds found in these books, we propose an attention-based recognition method consisting of two parts. The encoder uses a convolutional neural network (CNN) and a bidirectional long short-term memory network (Bi-LSTM) to obtain the feature sequence and sequence labeling of the text image. The decoder uses an attention mechanism to compute attention weights and combines them with a recurrent neural network (RNN) to produce the recognition result. The experimental dataset consists of 616 images of Tibetan Ujin ancient books held by the laboratory, and Tibetan character-block accuracy is the evaluation metric. Two text recognition models serve as baselines and are compared with the proposed model in terms of model size and recognition rate; the proposed model outperforms both on each measure. Its size is 41.2 MB, which is 36 MB smaller than the smallest baseline, and its character-block recognition accuracy is 90.55%, which is 7.94 percentage points higher than the best baseline result. These results indicate that the proposed attention-based model significantly improves recognition of touching characters and complex-background images in Tibetan Ujin ancient books.
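As an illustration of the attention step that the abstract attributes to the decoder, the following is a minimal sketch and not the authors' implementation. It assumes dot-product scoring, since the abstract does not state which scoring function is used. The names `attention_step`, `decoder_state`, and `encoder_features` are hypothetical stand-ins for the RNN hidden state and the CNN + Bi-LSTM feature sequence.

```python
import math


def attention_step(decoder_state, encoder_features):
    """Return (weights, context) for one decoding step.

    decoder_state: list of floats (hypothetical RNN hidden state).
    encoder_features: list of equal-length float lists, one per image column
    (hypothetical CNN + Bi-LSTM output).
    """
    # Assumed scoring function: dot product between the state and each feature.
    scores = [
        sum(s * f for s, f in zip(decoder_state, feat))
        for feat in encoder_features
    ]
    # Numerically stable softmax turns scores into attention weights.
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the encoder features.
    dim = len(encoder_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, encoder_features))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three feature vectors of dimension 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attention_step(state, features)
```

In a full recognizer, the context vector would be fed to the RNN together with the previously predicted character block to predict the next one. That wiring is omitted here because the abstract does not describe it.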

Keywords: Tibetan ancient books; character recognition; Ujin script; attention mechanism; character-block accuracy

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]
