Automatic Recognition System of Tabular Document and Its Application (表格型文档自动识别系统及其应用)  Cited by: 2

Authors: ZHANG Yan (张艳)[1], YU Shengyang (郁生阳)[2], ZHANG Chongyang (张重阳)[2], LOU Zhen (娄震)[2], YANG Jingyu (杨静宇)[2]

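Affiliations: [1] The Third Research Institute of the Ministry of Public Security, Shanghai 200031; [2] School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094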

Source: Journal of System Simulation (《系统仿真学报》), 2009, No. 10, pp. 2916-2920 (5 pages)

Funding: National Natural Science Foundation of China (60632050; 60503026); 863 Program (2006AA01Z119)

Abstract: With the wide use of document imaging systems, the automatic processing of document images has become an active research topic. This paper studies several key techniques in an automatic recognition system for tabular documents. First, for layout analysis, a document classification method based on frame line detection is proposed. Second, according to the characteristics of tabular document images, the corresponding methods for recognition field extraction, frame line removal, and handwritten character string segmentation are presented. Third, for handwritten digit recognition, a combined recognition method based on shape context features and gradient features is designed. Finally, the system is applied to the recognition of numeral (courtesy) amounts on bank bills; experiments on real tabular bill images demonstrate the effectiveness of the system, and its recognition rate reaches a practical level.

Keywords: tabular document; frame line detection; frame line removal; document image analysis; handwritten digit recognition

Classification number: TP391.13 [Automation and Computer Technology: Computer Application Technology]
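
Illustrative note: the abstract names frame line detection as the basis for document classification but gives no algorithmic detail. The sketch below is therefore not the authors' method; it is a minimal run-length approach, assuming a binarized image stored as nested lists (1 = ink, 0 = background). The names `detect_frame_lines` and `looks_tabular` and the thresholds `min_frac` and `min_lines` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binarized image: 1 = ink, 0 = background


def longest_run(values: List[int]) -> int:
    """Return the length of the longest run of consecutive 1s."""
    best = cur = 0
    for v in values:
        cur = cur + 1 if v else 0
        best = max(best, cur)
    return best


def detect_frame_lines(img: Image, min_frac: float = 0.6) -> Tuple[List[int], List[int]]:
    """Return (row indices, column indices) that contain a long ink run.

    min_frac is an assumed threshold: a row or column counts as a frame
    line when its longest ink run spans at least this fraction of the
    image width or height.
    """
    height, width = len(img), len(img[0])
    rows = [y for y in range(height)
            if longest_run(img[y]) >= min_frac * width]
    cols = [x for x in range(width)
            if longest_run([img[y][x] for y in range(height)]) >= min_frac * height]
    return rows, cols


def looks_tabular(img: Image, min_lines: int = 2) -> bool:
    """Assumed rule: tabular if enough horizontal and vertical lines exist."""
    rows, cols = detect_frame_lines(img)
    return len(rows) >= min_lines and len(cols) >= min_lines


def make_demo(size: int = 10) -> Image:
    """Build a square image with a box border and a short stroke inside."""
    img = [[0] * size for _ in range(size)]
    for i in range(size):
        img[0][i] = img[size - 1][i] = 1  # top and bottom frame lines
        img[i][0] = img[i][size - 1] = 1  # left and right frame lines
    img[4][3] = img[4][4] = img[5][4] = 1  # short handwriting-like stroke
    return img
```

Short strokes such as handwriting never reach the run-length threshold, so only the border rows and columns of the demo image are reported. A real system would also have to tolerate skew, broken lines, and line thickness, none of which this sketch handles.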

 
