Research on Chinese Character Vectorization and Font Library Generation Based on Key Contour Points (基于关键轮廓点的汉字矢量化及字库生成研究)


Authors: LI Xudong [1,2,3], CUI Ronghui, ZHAO Caiyun

Affiliations: [1] School of Software, Nankai University, Tianjin 300457; [2] Advanced Computing and Key Software (Xinchuang) Haihe Laboratory, Tianjin 300459; [3] Tianjin Key Laboratory of Operating System Enterprises, Tianjin 300457

Source: Software (《软件》), 2024, No. 9, pp. 52-59, 69 (9 pages)

Abstract: The digital preservation of characters from historic sites has long suffered from the lack of a standardized workflow, low accuracy, and high time and labor costs. To address this, the paper proposes a vectorization workflow for historic-site text built on two key techniques: multi-stage image processing and vectorized character extraction. The multi-stage image processing removes electronic noise and noise introduced by uneven illumination with a non-local means denoising algorithm, enhances image contrast with histogram equalization, and repairs damaged regions with inpainting based on partial differential equations, thereby restoring damaged character images. The vectorized character extraction exploits the characteristics of Chinese character strokes to extract more types of key contour points while removing redundant and noisy contour points. Experiments show that, compared with existing methods, the proposed extraction technique reduces memory usage by 8.0419% with an error of less than 0.15 relative to the original image. The extracted characters are stored in a Unicode font library and can be used in text editors.
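Two steps named in the abstract can be illustrated with a minimal sketch. This record does not include the paper's implementation, so everything below is an assumption: the function names `equalize_histogram` and `drop_collinear_points` are hypothetical, the image is taken to be an 8-bit grayscale nested list, and the collinearity filter is only a simple stand-in for removing redundant contour points, not the paper's key-contour-point selection.

```python
from typing import List, Tuple

Image = List[List[int]]   # assumed format: 8-bit grayscale, row-major
Point = Tuple[int, int]   # (x, y) pixel coordinate on a contour


def equalize_histogram(img: Image) -> Image:
    """Classic histogram equalization for an 8-bit grayscale image."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)

    # Cumulative count of pixels at or below each grey level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next((c for c in cdf if c > 0), 0)
    if total == 0 or total == cdf_min:
        # Empty or single-level image: there is no contrast to stretch.
        return [row[:] for row in img]

    # Map each grey level so the occupied range spans 0..255.
    lut = [round(max(c - cdf_min, 0) * 255 / (total - cdf_min)) for c in cdf]
    return [[lut[v] for v in row] for row in img]


def drop_collinear_points(contour: List[Point]) -> List[Point]:
    """Drop points of a closed contour that are collinear with both neighbours.

    Illustrative stand-in only; the paper's own point-selection rules
    are not described in this record.
    """
    n = len(contour)
    if n < 3:
        return list(contour)
    kept = []
    for i, (x, y) in enumerate(contour):
        px, py = contour[i - 1]          # previous point (wraps around)
        nx, ny = contour[(i + 1) % n]    # next point (wraps around)
        # Zero cross product: previous, current and next lie on one line.
        cross = (x - px) * (ny - y) - (y - py) * (nx - x)
        if cross != 0:
            kept.append((x, y))
    return kept
```

For example, `drop_collinear_points([(0, 0), (1, 0), (2, 0), (2, 2), (0, 2)])` discards `(1, 0)`, which lies on the edge between `(0, 0)` and `(2, 0)`, and keeps the four corners. Fewer stored points per glyph outline is the kind of saving the abstract's memory figure refers to, though the reported 8.0419% comes from the authors' method, not from this sketch.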

Keywords: historic-site text; denoising and damage restoration; text vectorization; Unicode font library; out-of-set character processing

Classification code: TP39 [Automation and Computer Technology: Computer Application Technology]

 
