Text Digital Image OCR Accuracy Measurement Experiment and Improvement  (Cited by: 10)

Author: Zang Guoquan (臧国全) [1]

Affiliation: [1] Department of Information Management, Zhengzhou University, Zhengzhou 450001

Source: Documentation, Information & Knowledge (《图书情报知识》), 2010, No. 3, pp. 62-67 (6 pages)

Funding: Supported by the Henan Province University Science and Technology Innovation Talent Support Program (2008-551)

Abstract: Drawing on two British Library digitization projects for historic English newspapers, Reshelp and Burney, the author runs an experiment measuring the OCR accuracy of text-based digital images. The results show that overall accuracy is not high, and that accuracy decreases in the following order: characters, words, significant words, and significant words beginning with a capital letter. The OCR cycle is then divided into four stages (acquisition of the object to be scanned, production of the digital image, processing of the digital image, and text recognition); the factors that affect accuracy at each stage are analyzed, and specific measures for improving accuracy are discussed.

Keywords: OCR recognition; accuracy testing; digitization of information resources

Classification code: G250 [Cultural Sciences: Library Science]
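
Illustrative note: the four accuracy levels named in the abstract (characters, words, significant words, capitalized significant words) could be computed along the lines of the sketch below. This is only a sketch under stated assumptions, not the procedure used in the paper: accuracy is assumed to be 1 minus the edit distance divided by the reference length, "significant" is assumed to mean "not in a stop-word list", and the `STOP_WORDS` set and all function names are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical stop-word list; the paper's own definition of a
# "significant word" is not given in this record.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "on", "for"}


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref: Sequence, hyp: Sequence) -> float:
    """Assumed metric: 1 - edit_distance / len(ref), clamped at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def is_significant(word: str) -> bool:
    return word.lower() not in STOP_WORDS


def is_capitalized_significant(word: str) -> bool:
    return is_significant(word) and word[:1].isupper()


def ocr_accuracy_report(reference: str, ocr_output: str) -> Dict[str, float]:
    """Compare OCR output with a hand-corrected reference at four levels."""
    ref_words: List[str] = reference.split()
    hyp_words: List[str] = ocr_output.split()
    return {
        "character": accuracy(reference, ocr_output),
        "word": accuracy(ref_words, hyp_words),
        "significant_word": accuracy(
            [w for w in ref_words if is_significant(w)],
            [w for w in hyp_words if is_significant(w)],
        ),
        "capitalized_significant_word": accuracy(
            [w for w in ref_words if is_capitalized_significant(w)],
            [w for w in hyp_words if is_capitalized_significant(w)],
        ),
    }


if __name__ == "__main__":
    # "m" misread as "rn" is a common OCR confusion in old newsprint.
    report = ocr_accuracy_report("The Times of London", "The Tirnes of London")
    for level, value in report.items():
        print(f"{level}: {value:.3f}")
```

In this toy input a single misread letter costs little at the character level but a full word at the word level, and proportionally more once stop words are filtered out, which is consistent with the ordering the abstract reports.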

 
