Authors: Gao Dingguo; Li Jingyi [1,2]; Suoang-Quzhen
Affiliations: [1] School of Information Science and Technology, Tibet University, Lhasa 850000, Tibet, China; [2] Tibetan Information Technology Innovative Talent Cultivation Demonstration Base, Tibet University, Lhasa 850000, Tibet, China
Source: Plateau Science Research (《高原科学研究》), 2024, No. 1, pp. 112-120 (9 pages)
Funding: National Natural Science Foundation of China (62166038); Sichuan Science and Technology Program (2023YFQ0044).
Abstract: Dunhuang Tibetan documents are precious sources for studying the social history of Tubo during the Tang Dynasty. In current digitization research on these documents, their age, poor-quality writing media, and poor preservation conditions leave the document images with cluttered backgrounds and blurred, incomplete text, which seriously degrades the accuracy and robustness of text recognition systems. To study how preprocessing of low-quality images of ancient documents affects character recognition, this paper takes Dunhuang Tibetan documents, whose image quality is extremely poor, as its research object. It enhances the images with traditional methods (logarithmic transformation, gamma transformation, median filtering, Gaussian filtering, and manual batch processing in PS), with binarization using global, adaptive, and custom thresholds, and with an image enhancement method based on the ViT neural network. Comparative experiments show that preprocessing low-quality ancient document images has little overall effect on the recognition rate, although Gaussian filtering, custom-threshold binarization, and neural-network-based image enhancement improve it to some extent.
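The record does not include the paper's implementation, so the following is only a minimal sketch of four of the traditional operations the abstract names (logarithmic transformation, gamma transformation, median filtering, and global-threshold binarization). It assumes an 8-bit grayscale image stored as a list of rows; the function names, the 3x3 window, and all parameter values are illustrative assumptions, not the authors' settings.

```python
import math
from statistics import median


def log_transform(img):
    """Logarithmic transformation s = c * log(1 + r).

    c is chosen so that input 255 maps to output 255 (an assumption).
    """
    c = 255 / math.log(256)
    return [[round(c * math.log(1 + p)) for p in row] for row in img]


def gamma_transform(img, gamma):
    """Gamma transformation s = 255 * (r / 255) ** gamma.

    gamma < 1 brightens dark regions; gamma > 1 darkens them.
    """
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]


def median_filter3(img):
    """3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def global_threshold(img, t):
    """Global binarization: pixels above t become 255, the rest 0."""
    return [[255 if p > t else 0 for p in row] for row in img]
```

A hypothetical pipeline could chain these, for example `global_threshold(median_filter3(page), 128)`, where `page` and the threshold `128` are placeholders. Gaussian filtering, adaptive thresholding, and the ViT-based enhancement mentioned in the abstract are not sketched here.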
Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]