Graph Feature Extraction Method in Chinese Character Recognition (汉字识别中图特征提取方法)

Authors: TANG Shan-cheng (唐善成) [1], LIANG Shao-jun (梁少君), DAI Feng-hua (戴风华), LAI Kun (来坤), CAO Yao-qian (曹瑶倩)

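Affiliations: [1] College of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710054, China; [2] CCCC Second Highway Engineering Bureau Co., Ltd., Xi'an 710065, China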

Source: Science Technology and Engineering (《科学技术与工程》), 2024, No. 2, pp. 658-664 (7 pages)

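Funding: National Key R&D Program of China (2018YFC0808300); Key Industry Innovation Chain (Cluster) Project of the Shaanxi Provincial Science and Technology Plan (2020ZDLGY15-07); Science and Technology Innovation Guidance Project of the Xi'an Science and Technology Plan (201805036YD14CG20(4)).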

Abstract: Representing Chinese characters by image pixels does not effectively capture their essential features and has high space complexity. To address this, a graph feature extraction method for Chinese characters is proposed. The method has three parts: binarization of the character image, skeleton extraction, and graph feature extraction. Binarization removes noise from the image and improves the accuracy of graph feature extraction; skeleton extraction retains the important pixels and discards irrelevant ones; graph feature extraction combines the key points of a character with a graph data structure to represent its shape. Experiments were carried out on 3 908 commonly used Chinese characters in five fonts. The results show that the method correctly extracts the graph features of characters with complex strokes and effectively represents their essential features. Up to 3 195 characters have identical graph features across different fonts, indicating that the method is relatively stable. On average, a character can be represented with 22.6 graph nodes and 19.1 edges, which greatly reduces space complexity compared with representing it as a single-channel image.
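
The abstract does not say how key points are chosen or how edges are traced, so the following Python sketch only illustrates the general idea of turning a binarized, already-thinned character into a graph. Everything in it is an assumption made for illustration: the fixed threshold of 128, the function names `binarize` and `extract_graph`, the choice of stroke endpoints and junctions as key points, and the use of 8-connectivity. The skeleton extraction step is not implemented; the input strokes are assumed to be one pixel wide already.

```python
# Illustrative sketch only; not the authors' implementation.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def binarize(gray, threshold=128):
    """Map dark ink (< threshold) to 1 and light background to 0 (assumed rule)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def ink_neighbours(skel, r, c):
    """Return the 8-connected ink pixels around (r, c)."""
    h, w = len(skel), len(skel[0])
    return [(r + dr, c + dc) for dr, dc in NEIGHBOURS
            if 0 <= r + dr < h and 0 <= c + dc < w and skel[r + dr][c + dc]]


def extract_graph(skel):
    """Build (nodes, edges) from a one-pixel-wide skeleton.

    nodes maps node id -> list of key pixels; edges is a set of (id_a, id_b).
    Key pixels are those whose ink-neighbour count is not 2 (assumption).
    """
    ink = [(r, c) for r, row in enumerate(skel) for c, v in enumerate(row) if v]
    degree = {p: len(ink_neighbours(skel, *p)) for p in ink}
    node_of, nodes = {}, {}
    for p in ink:
        if degree[p] == 2 or p in node_of:
            continue  # ordinary stroke pixel, or already assigned to a node
        node_id = len(nodes)
        node_of[p] = node_id
        if degree[p] <= 1:
            nodes[node_id] = [p]  # stroke endpoint (or isolated dot)
            continue
        # Junction: merge touching junction pixels into a single node.
        stack, members = [p], []
        while stack:
            q = stack.pop()
            members.append(q)
            for n in ink_neighbours(skel, *q):
                if degree[n] >= 3 and n not in node_of:
                    node_of[n] = node_id
                    stack.append(n)
        nodes[node_id] = members
    edges = set()
    for start, a in node_of.items():
        for first in ink_neighbours(skel, *start):
            prev, cur = start, first
            while cur not in node_of:  # follow the stroke through degree-2 pixels
                nxt = [n for n in ink_neighbours(skel, *cur) if n != prev]
                prev, cur = cur, nxt[0]
            b = node_of[cur]
            if a != b:  # self-loops and parallel edges are dropped in this sketch
                edges.add((min(a, b), max(a, b)))
    return nodes, edges


# A 7x7 grayscale image of a "十"-like cross: ink (0) on row 3 and column 3.
gray = [[0 if r == 3 or c == 3 else 255 for c in range(7)] for r in range(7)]
nodes, edges = extract_graph(binarize(gray))
print(len(nodes), sorted(edges))  # 5 [(0, 1), (1, 2), (1, 3), (1, 4)]
```

In this toy case the 49-pixel image reduces to 5 nodes (four endpoints and one junction) and 4 edges, which is the kind of saving the abstract reports at scale. A real pipeline would need a proper thinning step and a more careful treatment of corners, loops, and parallel strokes.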

Keywords: Chinese character recognition; graph features; graph data structure

Classification No.: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
