Distortion Correction Algorithm for Complex Document Image Based on Multi-level Text Detection  Cited by: 3

Authors: KOU Xi-chao; ZHANG Hong-rui; FENG Jie [2]; ZHENG Ya-yu [1]

Affiliations: [1] College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; [2] School of Informatics Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China

Source: Computer Science (《计算机科学》), 2021, No. 12, pp. 249-255 (7 pages)

Funding: National Natural Science Foundation of China (61501402).

Abstract: Distortion correction of document images is a basic step in document OCR (optical character recognition) and plays an important role in improving OCR accuracy. Correction often depends on text extraction; however, most current document image correction algorithms cannot accurately locate and analyze the text in complex documents, which leads to unsatisfactory results. To address this problem, a text detection framework based on a fully convolutional network is proposed, and the network is trained specifically on synthetic documents so that it can accurately obtain text information at three levels: characters, words, and text lines. The text is then adaptively sampled and the page is modeled in three dimensions with a cubic function, which turns the correction problem into a model parameter optimization problem and achieves the goal of correcting complex document images. Correction experiments on synthetic distorted documents and real test data show that the proposed method extracts text from complex documents accurately and clearly improves the visual quality of the corrected images. Compared with other algorithms, OCR accuracy after correction is significantly higher.
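The idea of describing a curved text line with a cubic function and then removing that curvature can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the 3D page model or the optimization procedure, so the example below is a simplified 2D analogue that fits a cubic to sampled text-line points by least squares and subtracts it. The function names `fit_cubic`, `cubic`, and `straighten`, and the synthetic sample points, are hypothetical and chosen only for illustration.

```python
def fit_cubic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 + c3*x^3 (normal equations)."""
    n = 4
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= f * A[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (b[i] - s) / A[i][i]
    return c


def cubic(c, x):
    """Evaluate the cubic with coefficients c = [c0, c1, c2, c3] at x."""
    return c[0] + c[1] * x + c[2] * x ** 2 + c[3] * x ** 3


def straighten(xs, ys, c):
    """Subtract the fitted curve so the sampled line sits at its mean height."""
    curve = [cubic(c, x) for x in xs]
    mean = sum(curve) / len(curve)
    return [y - v + mean for y, v in zip(ys, curve)]


# Hypothetical sample points along one curved text line (assumed data).
xs = [i / 10 - 1.0 for i in range(21)]
true_c = [5.0, 0.1, -0.2, 0.3]
ys = [cubic(true_c, x) for x in xs]

c = fit_cubic(xs, ys)
flat = straighten(xs, ys, c)
```

In this toy setting the fitted coefficients recover the curve that generated the points, and the straightened line is flat. The method summarized in the abstract works on a full page model rather than on a single line, so this only conveys the role the cubic plays.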

Keywords: convolutional neural network; text detection; 3D document modeling; document image correction; optical character recognition

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
