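Affiliations: [1] College of Computer Science, Beijing University of Technology, Beijing 100022 [2] School of Information Science, Beijing Language and Culture University, Beijing 100083 [3] College of Computer and Information Engineering, Inner Mongolia Normal University, Hohhot, Inner Mongolia 010022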
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2008, No. 10, pp. 1964-1968 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60572159)
Abstract: Chinese is a language with a large, open character set. Computing the glyph similarity of Chinese characters is a foundational task in Chinese information processing and plays an important role in character recognition, computer-aided proofreading of Chinese text, and the teaching of Chinese characters. This paper improves an existing method for describing the glyph structure of Chinese characters from the perspective of graphical similarity, and proposes an algorithm that computes glyph similarity from these structural descriptions. The algorithm requires no training on glyph samples. It is therefore well suited both to commonly used characters and to rare characters whose writing samples are hard to obtain, and it can meet the need for similarity computation over a continually expanding character set. Experiments show that the similar-character lists computed by this method for the 6763 characters of GB2312 agree well with human perception. Applied to a computer-aided proofreading system, the lists gave good results when used to suggest corrections for wrongly written characters.
Classification: TP391.12 [Automation and Computer Technology: Computer Application Technology]
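The abstract describes computing glyph similarity from structural descriptions rather than from trained samples. The record does not give the paper's actual description scheme or scoring formula, so the following is only a minimal sketch of the general idea under stated assumptions: each character is described by a layout type and an ordered tuple of components, and similarity is a weighted mix of layout agreement and position-wise component agreement. The `Glyph` class, the `DECOMP` table, the `glyph_similarity` function, and the `layout_weight` parameter are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Glyph:
    layout: str             # assumed layout label, e.g. "left-right"
    parts: Tuple[str, ...]  # components in reading order


# Hypothetical hand-written decompositions, for illustration only;
# they are not taken from the paper.
DECOMP = {
    "清": Glyph("left-right", ("氵", "青")),
    "情": Glyph("left-right", ("忄", "青")),
    "晴": Glyph("left-right", ("日", "青")),
    "昌": Glyph("top-bottom", ("日", "日")),
}


def glyph_similarity(a: str, b: str, layout_weight: float = 0.3) -> float:
    """Score two characters in [0, 1] from their structural descriptions.

    Assumed scoring rule (not the paper's): layout agreement contributes
    `layout_weight`, and the fraction of positions holding the same
    component contributes the remainder.
    """
    ga, gb = DECOMP[a], DECOMP[b]
    layout_score = 1.0 if ga.layout == gb.layout else 0.0
    longest = max(len(ga.parts), len(gb.parts))
    matches = sum(1 for x, y in zip(ga.parts, gb.parts) if x == y)
    part_score = matches / longest
    return layout_weight * layout_score + (1.0 - layout_weight) * part_score


def most_similar(char: str):
    """Rank every other known character by similarity to `char`."""
    others = [c for c in DECOMP if c != char]
    return sorted(others, key=lambda c: glyph_similarity(char, c), reverse=True)
```

Because the score depends only on the decomposition table, a rare character can be added by writing one new entry, with no sample collection or training, which is the property the abstract emphasizes.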