Design and Implementation of Off-Line Handwritten Document Recognition System of Manchu Manuscript  (Cited by: 7)

Authors: Zhao Ji (赵骥) [1], Li Jingjiao (李晶皎) [2], Zhang Guangyuan (张广渊) [2], Wang Jie (王杰) [1]

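Affiliations: [1] School of Computer Science and Engineering, Anshan University of Science and Technology, Anshan 114044; [2] School of Information Science and Engineering, Northeastern University, Shenyang 110004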

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2006, No. 6, pp. 801-805 (5 pages)

Funding: Natural Science Foundation of Liaoning Province (No. 2001113)

Abstract: Based on a study of the character features of handwritten Manchu script written with dots and circles, an off-line handwritten Manchu recognition method built on stroke sequences is proposed. First, digital image processing is used to extract the words from the target document and pre-process them. Each processed word is then decomposed into stroke primitives, which are recognized with a statistical pattern recognition method to obtain a stroke sequence. The stroke sequence is converted into a root sequence, and a fuzzy string matching algorithm produces the Roman transliteration of the Manchu word as output. Finally, a method based on the hidden Markov model (HMM) post-processes the word recognition results to further raise the recognition rate. Experiments show that, with stroke learning on a single handwriting style and two-word co-occurrence probabilities estimated from a large corpus, the system achieves a good recognition rate and good adaptability.

Keywords: Manchu; character recognition; post-processing; string matching; Roman transliteration; hidden Markov model (HMM)

Classification number: TP391.43 [Automation and Computer Technology: Computer Application Technology]
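
The last two stages named in the abstract, fuzzy string matching followed by post-processing with two-word co-occurrence probabilities, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each Roman letter stands for one recognized root, that fuzzy matching is plain Levenshtein distance against a word list, and that the HMM post-processing is a Viterbi search over the candidate words. The word list, the probabilities, and the names `edit_distance`, `candidates`, and `viterbi` are all hypothetical.

```python
import math

# Toy lexicon of Roman transliterations (illustrative only).
LEXICON = ["amba", "bira", "biya", "manju", "gisun"]

# Made-up two-word co-occurrence probabilities, not corpus statistics.
BIGRAM = {("amba", "bira"): 0.2, ("amba", "biya"): 0.01}


def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete x
                           cur[j - 1] + 1,             # insert y
                           prev[j - 1] + (x != y)))    # substitute or match
        prev = cur
    return prev[-1]


def candidates(observed, lexicon, k=2):
    """Return the k (distance, word) pairs closest to the observed sequence."""
    return sorted((edit_distance(observed, w), w) for w in lexicon)[:k]


def viterbi(cand_lists, bigram, floor=1e-6, penalty=1.0):
    """Choose one candidate per position.

    Score = sum of log bigram probabilities minus penalty * edit distance.
    Unseen word pairs get the small probability `floor`.
    """
    # best maps a word to (score, path) for the best path ending in that word.
    best = {w: (-penalty * d, [w]) for d, w in cand_lists[0]}
    for cands in cand_lists[1:]:
        nxt = {}
        for d, w in cands:
            score, path = max(
                (s + math.log(bigram.get((p, w), floor)) - penalty * d, pth)
                for p, (s, pth) in best.items())
            nxt[w] = (score, path + [w])
        best = nxt
    return max(best.values())[1]


# The second word was misread as "bila"; two lexicon words are equally close.
observed = ["amba", "bila"]
cand_lists = [candidates(w, LEXICON) for w in observed]
decoded = viterbi(cand_lists, BIGRAM)
```

Here "bila" is one edit away from both "bira" and "biya", so string matching alone cannot separate them. The co-occurrence table decides the outcome, which is the role the abstract assigns to the post-processing stage.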

 
