Similarity Calculation of Chinese Character Glyph and its Application in Computer Aided Proofreading System  Cited by: 5

Original title: 汉字字形计算及其在校对系统中的应用

Authors: Song Rou (宋柔)[1,2], Lin Min (林民)[1,3], Ge Shili (葛诗利)[2]

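Affiliations: [1] College of Computer Science, Beijing University of Technology, Beijing 100022; [2] College of Information Science, Beijing Language and Culture University, Beijing 100083; [3] College of Computer and Information Engineering, Inner Mongolia Normal University, Hohhot, Inner Mongolia 010022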

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2008, No. 10, pp. 1964-1968 (5 pages)

Funding: Supported by the National Natural Science Foundation of China (60572159)

Abstract: Chinese is written with a large, open character set. Computing the glyph similarity of Chinese characters is a basic task in Chinese information processing and matters for character recognition, computer-aided proofreading of Chinese text, and the teaching of Chinese characters. This paper improves existing methods for describing Chinese character glyph structure from the perspective of graphical similarity, and proposes a glyph similarity algorithm based on the structural description. The algorithm needs no training on glyph samples, so it adapts well both to commonly used characters and to rare characters whose writing samples are hard to obtain, and it can meet the need to compute similarity over a continually expanding character set. Experiments show that the similar-character lists computed by this method for the 6763 characters of GB2312 agree well with human perception. The lists have been applied to suggesting corrections for wrongly written characters in a computer-aided proofreading system, with good results.

Keywords: Chinese character glyph; structure description; similarity

Classification: TP391.12 [Automation and Computer Technology: Computer Application Technology]
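
The abstract describes similarity computed from a structural description of each glyph rather than from trained samples. The record does not give the paper's description scheme or formula, so the following is only a minimal illustrative sketch under stated assumptions: the `Glyph` record, the layout codes `"lr"` and `"tb"`, the hand-written `GLYPHS` table, the `layout_weight` parameter, and the scoring rule are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """Toy structural description: a layout code plus ordered components."""
    layout: str   # hypothetical codes: "lr" = left-right, "tb" = top-bottom
    parts: tuple  # component characters in writing order


# Hand-written toy descriptions for illustration only (not the paper's data).
GLYPHS = {
    "明": Glyph("lr", ("日", "月")),
    "朋": Glyph("lr", ("月", "月")),
    "晴": Glyph("lr", ("日", "青")),
    "清": Glyph("lr", ("氵", "青")),
    "昌": Glyph("tb", ("日", "日")),
}


def similarity(a, b, layout_weight=0.3):
    """Score in [0, 1]: weighted sum of layout agreement and the fraction
    of positions where both glyphs have the same component."""
    ga, gb = GLYPHS[a], GLYPHS[b]
    layout_score = 1.0 if ga.layout == gb.layout else 0.0
    longest = max(len(ga.parts), len(gb.parts))
    matches = sum(x == y for x, y in zip(ga.parts, gb.parts))
    part_score = matches / longest
    return layout_weight * layout_score + (1 - layout_weight) * part_score


def similar_chars(ch, top_n=3):
    """Rank the other characters in the table by similarity to `ch`.
    Ties are broken by code point so the order is deterministic."""
    scored = [(similarity(ch, other), other) for other in GLYPHS if other != ch]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored[:top_n] if score > 0]
```

In a proofreading setting of the kind the abstract mentions, a list such as `similar_chars("晴")` could serve as candidate replacements offered to the user when a character is flagged as a likely wrong character. Because the scores come only from the descriptions, adding a rare character requires a new table entry and no writing samples.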

 
