Character Segmentation Algorithm for Address Lines on Handwritten Chinese Envelopes (手写中文信封的地址行字符切分算法)  Cited by: 3

An Offline Handwritten Character Segmentation Algorithm for Mail Address


Authors: Han Zhi (韩智)[1], Liu Changping (刘昌平)[1], Yin Xucheng (殷绪成)[1]

Affiliation: [1] Character Recognition Engineering Center, Institute of Automation, Chinese Academy of Sciences

Source: Journal of Chinese Information Processing (《中文信息学报》), 2006, No. 1, pp. 85-90 (6 pages)

Funding: Supported by the National 863 Program of China (grant 2001AA114130)

Abstract: In a handwritten Chinese envelope processing system, segmenting the characters of the address line is the key step toward recognizing that line. This paper proposes a segmentation algorithm tailored to the characteristics of address-line characters on postal envelopes. First, structure-based methods (vertical projection, connected-component extraction, and stroke crossing number analysis) produce an initial segmentation of the address-line image into a sequence of basic blocks. Next, neighboring blocks are merged to form multiple candidate segmentation paths. These paths are then evaluated using character recognition confidence and prior knowledge from the postal destination address database, and the best segmentation path is selected. Tested on more than 500 real envelope images captured by a postal sorting machine, the algorithm achieved an accuracy of 78.61% for address-line recognition alone and a sorting accuracy of 95.42% when address-line recognition was combined with postcode recognition.
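The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: only vertical projection is used for the initial split (connected components and stroke crossing number analysis are omitted), and the `score` callback is a hypothetical stand-in for the recognition confidence and address-database knowledge that the paper uses to rate paths. The function names `projection_blocks` and `best_path`, the `max_merge` limit, and the additive path score are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Block = Tuple[int, int]  # (start_col, end_col), end exclusive


def projection_blocks(image: Sequence[Sequence[int]]) -> List[Block]:
    """Split a binary text-line image (1 = ink) at columns with no ink."""
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: amount of ink in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]
    blocks: List[Block] = []
    start = None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                      # a run of ink columns begins
        elif not ink and start is not None:
            blocks.append((start, x))      # the run ends at an empty column
            start = None
    if start is not None:
        blocks.append((start, width))
    return blocks


def best_path(blocks: List[Block],
              score: Callable[[Block], float],
              max_merge: int = 3) -> List[Block]:
    """Pick the merge of adjacent blocks that maximizes the summed score.

    Dynamic programming over block boundaries; each candidate character
    is a run of at most `max_merge` neighboring blocks.
    """
    n = len(blocks)
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for k in range(1, min(max_merge, end) + 1):
            begin = end - k
            span = (blocks[begin][0], blocks[end - 1][1])
            cand = best[begin] + score(span)
            if cand > best[end]:
                best[end], back[end] = cand, begin
    # Walk the back-pointers to recover the chosen segments.
    path: List[Block] = []
    end = n
    while end > 0:
        begin = back[end]
        path.append((blocks[begin][0], blocks[end - 1][1]))
        end = begin
    return path[::-1]


# Toy line: a left-right character split by a one-column gap, then a
# second character. Widths near 5 columns are treated as "good".
rows = ["110110011111", "110110011111"]
image = [[int(c) for c in row] for row in rows]
blocks = projection_blocks(image)


def width_score(span: Block) -> float:
    """Hypothetical score: penalize deviation from a 5-column width."""
    return -abs((span[1] - span[0]) - 5)


segments = best_path(blocks, width_score)
```

On this toy input the projection yields three blocks, and the path search merges the first two into a single character because that combination scores best. In a real system the width-based score would be replaced by a classifier's confidence and a lexicon check against known addresses.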

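Keywords: artificial intelligence; pattern recognition; postal envelope address; offline handwritten Chinese characters; character segmentation; OCR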

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
