Recognition-based system for segmentation of touching handwritten numeral strings  Cited by: 6

Authors: LEI Yun[1], LIU Changsong[1], DING Xiaoqing[1], FU Qiang[1]

Affiliation: [1] Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Source: Journal of Tsinghua University (Science and Technology), 2005, No. 4, pp. 433-436 (4 pages)

Funding: Supported by the National Natural Science Foundation of China (60241005)

Abstract: A recognition-based segmentation system is proposed to handle touching digits in handwritten numeral strings. External contour analysis and projection analysis are combined to locate candidate segmentation lines, which are then used to over-segment the numeral string. Each subimage of the over-segmented string is defined as a fragment, and a combination of one or more adjacent fragments is defined as a clique, so each candidate segmentation result consists of one or more cliques. All candidate segmentation results are described by a probabilistic model, with a single-digit classifier used to recognize each clique. The maximum a posteriori (MAP) criterion is then used to select the optimal segmentation result. The search for the optimal result uses a pruning algorithm that reduces time and space complexity to meet real-time processing requirements. In experiments on samples collected from NIST SD19, the correct segmentation rate reached 97.72%.
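
The over-segment-and-recombine search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the probabilistic model or the pruning algorithm, so the sketch assumes that the score of a candidate segmentation is the product of the classifier probabilities of its cliques, and it substitutes two simple pruning rules (a cap on fragments per clique and a probability floor) for the paper's unspecified pruning. The names `best_segmentation`, `score_clique`, `max_span`, `min_prob`, and `toy_scorer` are hypothetical.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Clique = Tuple[str, ...]
# Hypothetical classifier interface: clique -> (digit label, probability).
Scorer = Callable[[Clique], Tuple[str, float]]
Hypothesis = Tuple[float, List[str], List[Clique]]


def best_segmentation(
    fragments: Sequence[str],
    score_clique: Scorer,
    max_span: int = 3,      # assumed pruning rule: at most 3 fragments per clique
    min_prob: float = 1e-3  # assumed pruning rule: drop implausible cliques
) -> Hypothesis:
    """Return (log-probability, labels, cliques) of the best grouping."""
    n = len(fragments)
    # best[i] is the best hypothesis covering fragments[:i], or None.
    best: List[Optional[Hypothesis]] = [None] * (n + 1)
    best[0] = (0.0, [], [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_span), end):
            prev = best[start]
            if prev is None:
                continue
            clique = tuple(fragments[start:end])
            label, prob = score_clique(clique)
            if prob < min_prob:
                continue  # prune this clique
            logp = prev[0] + math.log(prob)
            current = best[end]
            if current is None or logp > current[0]:
                best[end] = (logp, prev[1] + [label], prev[2] + [clique])
    result = best[n]
    if result is None:
        raise ValueError("every candidate segmentation was pruned")
    return result


def toy_scorer(clique: Clique) -> Tuple[str, float]:
    """Stand-in for a single-digit classifier, using made-up probabilities."""
    table = {
        ("3a",): ("1", 0.2),
        ("3b",): ("7", 0.3),
        ("3a", "3b"): ("3", 0.9),
        ("5",): ("5", 0.95),
    }
    return table.get(clique, ("?", 0.01))
```

Calling `best_segmentation(["3a", "3b", "5"], toy_scorer)` merges the first two fragments into one clique, because the merged clique scores higher than the two fragments recognized separately. Because the assumed score is a product over cliques, keeping only the best hypothesis at each fragment boundary is sufficient; a different probabilistic model might require keeping several.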

Keywords: character recognition; handwriting; segmentation; contour; maximum a posteriori

Classification code: TP391.43 [Automation and Computer Technology: Computer Application Technology]

 
