Study on Tibetan text recognition in natural scenes based on deep learning (基于深度学习的自然场景藏文识别研究)  Cited by: 9

Authors: Rinchen-Dhondub (仁青东主); Nyima-Tashi (尼玛扎西) (School of Information Science and Technology, Tibet University, Lhasa 850000)

Affiliation: [1] School of Information Science and Technology, Tibet University

Source: Plateau Science Research (《高原科学研究》), 2019, No. 4, pp. 96-103 (8 pages)

Funding: National Key R&D Program of China, Key Special Project (2017YFB1402200); 2019 Tibet University Cultivation Project (ZDCZJH19-19)

Abstract: Text recognition in natural scenes has become an important research topic in computer vision. Most current methods, however, focus on Chinese and English, and work on recognizing Tibetan text in natural scenes remains rare. This paper examines traditional Tibetan optical character recognition (OCR) and deep-learning-based text recognition. To address the complex image quality and touching (connected) characters found in natural scenes, it proposes a Tibetan scene text recognition model that combines a Convolutional Recurrent Neural Network (CRNN) adapted to Tibetan with Connectionist Temporal Classification (CTC). The model applies sliding-window line recognition to handle touching characters in long text lines. It also applies two-dimensional string recognition: a horizontal string recognition core that works on characters and a vertical string recognition core that works on letters, used to recognize high-frequency characters (mainly modern Tibetan characters) and low-frequency characters (mainly Tibetan transliterations of Sanskrit), respectively. These techniques markedly improve Tibetan text recognition in natural scenes; in a test on 600 samples, the average accuracy was 93.24%.
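
The abstract names two generic building blocks: sliding-window processing of a long text line, and CTC decoding of per-frame outputs. The sketch below illustrates both in plain Python. It is not the authors' implementation: the function names, the window and stride values, the blank index 0, and the integer label ids are all assumptions made for illustration, and the decoder shown is the standard best-path (greedy) CTC rule rather than anything specific to this paper.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def sliding_windows(line_width: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges that cover a text line of the given width."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if line_width <= window:
        return [(0, line_width)]
    starts = list(range(0, line_width - window + 1, stride))
    if starts[-1] + window < line_width:
        # Add a final window flush with the right edge so no columns are skipped.
        starts.append(line_width - window)
    return [(s, s + window) for s in starts]


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding: argmax per frame, merge repeats, then drop blanks."""
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    collapsed = [label for label, _ in groupby(best)]
    return [label for label in collapsed if label != blank]


def one_hot(label: int, num_classes: int) -> List[float]:
    """Toy per-frame score vector that puts all mass on one label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# Toy usage: seven frames whose best labels are 1,1,blank,1,2,2,blank.
toy_frames = [one_hot(k, 3) for k in (1, 1, 0, 1, 2, 2, 0)]
toy_labels = ctc_greedy_decode(toy_frames)
toy_windows = sliding_windows(line_width=11, window=4, stride=3)
```

The blank between the second and third occurrences of label 1 is what lets CTC emit the same label twice in a row; without it, adjacent repeats are merged into one. In a real system the per-frame scores would come from the recurrent layers of a CRNN applied to each window, and the label ids would map to Tibetan characters or letters.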

Keywords: Tibetan; Tibetan text recognition; natural scenes; deep learning

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
