Authors: Yang Chun (杨春), Liu Chang (刘畅), Fang Zhiyu (方治屿), Han Zheng (韩铮), Liu Chenglin (刘成林) [3], Yin Xucheng (殷绪成) [1,2]
Affiliations: [1] School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; [2] Pattern Recognition and Artificial Intelligence Lab, University of Science and Technology Beijing, Beijing 100083, China; [3] Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Source: Journal of Image and Graphics (《中国图象图形学报》), 2023, No. 6, pp. 1767-1791 (25 pages)
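Funding: National New Generation Artificial Intelligence (2030) Major Project (2020AAA0109701); National Science Fund for Distinguished Young Scholars (62125601); National Natural Science Foundation of China (62076024, 62006018).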
Abstract: In pattern recognition and text recognition applications in open environments, new data, new patterns, and new categories emerge continually, so algorithms must be able to handle patterns from novel categories. To address this problem, researchers have begun to focus on the open-set text recognition (OSTR) task. The task requires that, at test (inference) time, an algorithm both recognize the character categories seen in the training set and recognize, reject, or discover novel characters not seen in the training set. Open-set text recognition is gradually becoming one of the research hotspots in the text recognition field. This paper first briefly summarizes open-set pattern recognition techniques, and then focuses on the research background, task definition, basic concepts, research priorities, and technical difficulties of open-set text recognition. For the three major problems of open-set text recognition (unknown sample discovery, novel category recognition, and contextual information bias), related work is reviewed from the perspectives of model structure, characteristics and advantages, and application scenarios. Finally, the development trends and research directions of open-set text recognition are analyzed and discussed.
Text recognition transcribes the text in images and is used in domains such as document digitization, content moderation, scene text translation, autonomous driving, and scene understanding. Conventional text recognition techniques mostly concentrate on recognizing seen characters. However, two factors are not well covered by the training sets of these methods: novel character categories and out-of-vocabulary (OOV) samples. Samples containing novel characters are often also OOV samples, but OOV samples may involve only seen characters. Novel character categories arise, for example, as unseen ligatures, emoticons, and unfamiliar languages in internet-oriented environments, in scene-text recognition environments, and as characters from foreign and region-specific languages. In digitization settings, previously undiscovered characters may likewise be missing. Because language use is heterogeneous, linguistic statistics (e.g., n-grams and context) can gradually drift away from those of the training data, which challenges text recognition methods that depend heavily on vocabulary. These two factors give rise to three key scientific problems that affect cost or efficiency in open-world applications. Novel characters call for a novel-character spotting capability, so that unseen characters are rejected rather than silently replaced by seen characters. Furthermore, as in the widely studied open-set recognition problem, three scientific problems can be identified, as described below. First, novel characters keep emerging, and retraining on each occurrence is costly and inefficient in many cases, so an incremental learning capability needs to be strengthened. Second, the generalized zero-shot learning text recognition task has received a considerable amount of attention. Third, OOV samples raise the need for robustness to linguistic bias. Owing to the character-based nature of prediction, more popular methods can
Keywords: text recognition; open-set pattern recognition; open-set text recognition (OSTR); closed-set text recognition; zero-shot text recognition
Classification number: TP391.43 [Automation and Computer Technology: Computer Application Technology]
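Illustrative note: the task definition in the abstract (recognize seen character categories, reject or discover unseen ones, and accept new categories without retraining) can be sketched with a toy nearest-prototype classifier. This is only a minimal sketch under stated assumptions and is not a method taken from the surveyed paper: the function names `cosine` and `classify_open_set`, the `UNKNOWN` marker, the three-dimensional feature vectors, and the threshold of 0.8 are all hypothetical choices made for illustration.

```python
import math

UNKNOWN = "<unknown>"  # hypothetical marker for rejected (unseen) characters


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_open_set(feature, prototypes, threshold=0.8):
    """Return (label, score) for the most similar prototype.

    If the best similarity is below `threshold` (an assumed value),
    the sample is rejected as UNKNOWN instead of being forced
    into one of the seen classes.
    """
    best_label, best_score = UNKNOWN, -1.0
    for label, proto in prototypes.items():
        score = cosine(feature, proto)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return UNKNOWN, best_score
    return best_label, best_score


# Toy prototypes for two "seen" character classes (made-up features).
prototypes = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0]}

seen = classify_open_set([0.9, 0.1, 0.0], prototypes)   # close to "A"
novel = classify_open_set([0.0, 0.0, 1.0], prototypes)  # matches nothing

# A new class is registered by adding one prototype, without retraining.
prototypes["C"] = [0.0, 0.0, 1.0]
after = classify_open_set([0.0, 0.1, 0.9], prototypes)

print(seen[0], novel[0], after[0])
```

In this toy setting the rejection threshold stands in for unknown sample discovery, and adding a prototype stands in for recognizing a novel category; real OSTR systems would use learned image features and would also have to deal with contextual (linguistic) bias, which this sketch does not address.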