Template-based Handwritten Form Information Extraction Method Using Deep Learning


Authors: DONG Qian-qian; CHEN Liang [1]; WANG Xin-xin (School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, China)

Affiliation: [1] School of Computer Science, Xi'an Polytechnic University, Xi'an, Shaanxi 710600, China

Source: Computer Technology and Development, 2024, No. 10, pp. 204-212 (9 pages)

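Funding: Key Scientific Research Program of the Shaanxi Provincial Department of Education (22JS021); National Natural Science Foundation of China (51675108).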

Abstract: Handwritten paper forms are an important data carrier for information exchange between departments in manufacturing enterprises, and extracting their key information matters for production, management, and decision-making. However, current handwritten form information extraction schemes struggle to extract key information accurately and quickly from complex text layouts. To address this, a two-stage template-based handwritten form information extraction method is proposed. It needs only one image to build a template, focuses on the information the user cares about, and avoids the logical errors that traditional relation extraction can make on complex tables. First, for a given type of form image, the regions to be recognized are annotated directly on the image and each region is assigned a key. Then a high-resolution network is used to improve detection precision on small text, and a multi-resolution uniform split-and-shuffle strategy is proposed so that the detection model does well on both performance and parameter count. In addition, a temporal convolutional network and a self-attention mechanism are introduced so that the recognition model copes better with blurred, unclear, or missing strokes caused by writing speed and writing tools. After recognition, the system binds the recognition results to the preset keys to form structured output. Experiments show that, compared with a typical ResNet50 model with a nearly equal number of parameters, small-text detection precision improves by 15.8 percentage points. For text recognition, the model reaches a character precision of 99.30% on the CASIA-HWDB2.0-2.2 dataset. When the text box does not fully cover the text line, character precision drops by only 0.55 percentage points, indicating that the text recognition model is robust.
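The template step in the abstract (annotate regions on one image, assign each a key, then bind recognition results to those keys) can be illustrated with a minimal sketch. The abstract does not say how recognized text is matched to annotated regions, so the overlap rule, the 0.5 threshold, the names `Box`, `overlap_ratio`, and `bind_to_template`, and the sample field names below are all assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in image pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def overlap_ratio(region: Box, detected: Box) -> float:
    """Fraction of the detected box's area that lies inside the template region."""
    inter = Box(max(region.x1, detected.x1), max(region.y1, detected.y1),
                min(region.x2, detected.x2), min(region.y2, detected.y2))
    area = detected.area()
    return inter.area() / area if area > 0 else 0.0


def bind_to_template(template: Dict[str, Box],
                     detections: List[Tuple[Box, str]],
                     min_overlap: float = 0.5) -> Dict[str, str]:
    """Assign each recognized text line to the best-overlapping template key.

    Assumed rule: a detection is kept only if at least `min_overlap` of its
    area falls inside some annotated region; everything else is ignored.
    """
    collected: Dict[str, List[Tuple[Box, str]]] = {key: [] for key in template}
    for box, text in detections:
        best_key = max(template, key=lambda k: overlap_ratio(template[k], box))
        if overlap_ratio(template[best_key], box) >= min_overlap:
            collected[best_key].append((box, text))
    # Join multiple lines in one region in top-to-bottom, left-to-right order.
    return {
        key: "".join(t for _, t in sorted(items, key=lambda it: (it[0].y1, it[0].x1)))
        for key, items in collected.items()
    }


# Template built from a single annotated image (field names are made up).
template = {"part_no": Box(100, 50, 300, 90), "operator": Box(100, 120, 300, 160)}
# Output of the detection + recognition stage: (box, recognized text).
detections = [
    (Box(110, 55, 200, 85), "A-1024"),
    (Box(105, 125, 180, 155), "Zhang"),
    (Box(400, 400, 450, 420), "noise"),  # outside every annotated region
]
result = bind_to_template(template, detections)
print(result)
```

Because matching is driven by the annotated regions rather than by inferred key-value relations, text outside the template (the third detection here) simply never enters the structured output.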

Keywords: information extraction; handwritten form; template-based; handwritten text recognition; text line detection

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
