Authors: Liang Tiancai (梁添才), Liu Jianping (刘建平), Luo Panfeng (罗攀峰) [1,3,2]
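Affiliations: [1] GRG Banking Research Institute, GRG Banking Equipment Co., Ltd. (广州广电运通金融电子股份有限公司), Guangzhou, Guangdong 510663; [2] Guangdong Provincial Key Enterprise Laboratory of Currency Recognition (广东省货币识别企业重点实验室), Guangzhou, Guangdong 510663; [3] School of Mechanical Engineering, Tsinghua University, Beijing 100084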
Source: Computer Simulation (《计算机仿真》), 2015, No. 9, pp. 276-280 (5 pages)
Abstract: Image binarization is an algorithm that separates text from the background, and it is a key preprocessing step for subsequent recognition tasks. Low-quality text images often contain several kinds of degradation, such as uneven illumination, large variation in stroke gray level, and bleed-through from the reverse side. These degradations make binarization very difficult. Binarization methods based on Laplacian energy, which have appeared in recent years, achieve good results on degraded documents, but they tend to lose thin, long, weak strokes. An improved Laplacian energy method is therefore proposed: exploiting the strong double-edge response of strokes, it strengthens the Laplacian operator response in stroke regions so that thin, long, weak strokes are preserved. Tests on the DIBCO2013 data show that the proposed method handles the binarization of thin, long, weak strokes well.
Abstract (published English version): The main function of a document image binarization algorithm is to extract text from the image background. Binarization is a key preprocessing step in automatic document processing systems. Extracting text from badly degraded document images is a very challenging task because of poor illumination, bleed-through, and the high inter- and intra-image variation between the document background and the foreground text. A recent algorithm based on Laplacian energy has achieved good performance on degraded document images, but its main drawback is that thin, long, weak strokes cannot be handled properly. In this paper, a modified Laplacian energy is proposed, based on the observation that strokes have a relatively strong double-edge response. Thin, long, weak strokes in degraded document images can be segmented properly by combining the Laplacian with the double-edge response of the image intensity. Experiments on the DIBCO2013 dataset show the superior performance of the proposed method on the extraction of thin, long, weak strokes, compared with other techniques.
Keywords: image binarization; degraded text images; Laplacian operator; graph cut algorithm
Classification number: TP391.41 [Automation and Computer Technology - Computer Application Technology]
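
The abstract's central idea, strengthening the Laplacian response where a stroke shows edges on both sides, can be illustrated with a small sketch. This is not the authors' formulation: the record gives no equations, so the double-edge measure (`double_edge_strength`), the weight `alpha`, and the final sign test are all assumptions chosen for illustration, and the sign test stands in for the graph cut step named in the keywords. Dark text on a lighter background is assumed.

```python
import numpy as np

def laplacian(img):
    """4-neighbour discrete Laplacian with replicated borders."""
    p = np.pad(img.astype(float), 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1])

def double_edge_strength(img):
    """Hypothetical double-edge measure (an assumption, not from the paper).

    For a dark stroke, intensity rises on BOTH opposite sides of a stroke
    pixel. Take the smaller of the two opposite one-sided differences,
    clip negatives to zero, and keep the stronger of the horizontal and
    vertical directions.
    """
    p = np.pad(img.astype(float), 1, mode="edge")
    c = p[1:-1, 1:-1]
    horiz = np.minimum(p[1:-1, :-2] - c, p[1:-1, 2:] - c)
    vert = np.minimum(p[:-2, 1:-1] - c, p[2:, 1:-1] - c)
    return np.maximum(np.clip(horiz, 0, None), np.clip(vert, 0, None))

def enhanced_laplacian(img, alpha=1.0):
    """Laplacian plus a weighted double-edge term (illustrative only)."""
    return laplacian(img) + alpha * double_edge_strength(img)

# Toy image: light background (200) with a weak one-pixel-wide stroke (180).
img = np.full((5, 7), 200.0)
img[:, 3] = 180.0

# Simple sign test used here in place of graph cut: positive response = stroke.
stroke_mask = enhanced_laplacian(img) > 0
```

On this toy image the plain Laplacian at a stroke pixel is 40 and the double-edge term adds 20, while the neighbouring background pixels stay at -20, so the weak stroke is separated from the background by a wider margin.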