Frontiers of intelligent document analysis and recognition: review and prospects  Cited by: 7

Authors: Liu Chenglin (刘成林)[1,2]; Jin Lianwen (金连文); Bai Xiang (白翔)[4]; Li Xiaohui (李晓辉); Yin Fei (殷飞)

Affiliations: [1] State Key Laboratory of Multi-Modal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; [2] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; [3] School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641, China; [4] School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China

Source: Journal of Image and Graphics (中国图象图形学报), 2023, No. 8, pp. 2223-2252 (30 pages)

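Funding: National Natural Science Foundation of China (61936003, 61733007, 61721004); Ministry of Science and Technology "Innovation 2030" New Generation Artificial Intelligence Major Project (2020AAA0109702).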

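Abstract: Document analysis and recognition (document recognition for short) converts unstructured document data (images, online handwriting) into structured data that computers can process and understand, and it has a very wide range of applications. Since the 1960s, research on and application of document recognition methods have attracted wide attention and made great progress. Thanks to the development and application of deep learning, the performance of document recognition has improved rapidly, and the technology is widely used in document digitization, bill and form processing, handwriting input, intelligent transportation, document retrieval, and information extraction. This article first introduces the background and technical scope of document recognition and reviews the history of the field, then surveys research since the rise of deep learning, analyzes the shortcomings of current technology, and suggests research directions that deserve attention. The survey of the state of the art is organized by the main technical stages of document analysis and recognition (document image pre-processing, layout analysis, scene text detection, text recognition, structured symbol and graphics recognition, and document retrieval and information extraction); for each stage it briefly covers representative work on traditional methods and focuses on new progress in deep learning methods. Overall, the objects of research are expanding in depth and breadth, processing methods have shifted entirely to deep neural network models and deep learning, recognition performance has improved substantially, and application scenarios keep expanding. Based on this analysis, the article points out that current technology still has clear shortcomings in recognition accuracy and reliability, interpretability, learning ability, and adaptability. Finally, it proposes research directions from three perspectives: improving performance, extending applications, and improving learning ability. For improving performance, the research problems include the reliability of text recognition, interpretability, all-element recognition, long-tail problems, multilingual recognition, segmentation and understanding of complex layouts, and analysis and recognition of distorted documents. Application extension covers both new applications (such as robotic process automation (RPA), text transcription, and archaeology) and new technical problems (semantic information extraction, cross-modal fusion, application-oriented reasoning and decision making). For improving learning ability, the relevant problems include few-shot learning, transfer learning, multi-task learning, domain adaptation, structured prediction, weakly supervised learning, self-supervised learning, open…

Document analysis and recognition (document recognition for short) aims to convert non-structured documents (typically, document images and online handwriting) into structured texts to facilitate computer processing and understanding. It is needed in a wide range of applications because of the pervasive communication and usage of documents. The field of document recognition has attracted intensive attention and produced enormous progress in research and applications since the 1960s. In particular, the recent development of deep learning technology has boosted the performance of document recognition remarkably compared to traditional methods, and the technology has been applied successfully to document digitization, form processing, handwriting input, intelligent transportation, document retrieval, and information extraction. In this article, we first introduce the background and involved techniques of document recognition, give an overview of the history of research (divided into four periods according to the objects of research, the methods, and the applications), and then review the main research progress with emphasis on deep learning based methods developed in recent years. After identifying the insufficiencies of current technology, we finally suggest some important issues for future research. The review of recent progress is divided into sections corresponding to the main processing steps, namely image pre-processing, layout analysis, scene text detection, text recognition, structured symbol and graphics recognition, and document retrieval and information extraction. 1) Due to the popularity of camera-captured document images, the current main task in image pre-processing is the rectification of distorted images, while the task of binarization is still of concern. Recent methods are mostly end-to-end …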

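Keywords: document analysis and recognition; document intelligence; layout analysis; text detection; text recognition; graphic symbol recognition; semantic information extraction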

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
