Study on Printed Tibetan Character Recognition Technology (藏文印刷体字符识别技术研究)  Cited by: 10


Authors: Li Yongzhong (李永忠)[1], Wang Yulei (王玉雷)[1], Liu Zhenzhen (刘真真)[1]

Affiliation: [1] School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003

Source: Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》), 2012, No. 1, pp. 55-62 (8 pages)

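Funding: National Natural Science Foundation of China (69973038); Natural Science Foundation of Jiangsu Higher Education Institutions (05KJD52006); Jiangsu University of Science and Technology Research Funding Project (2005DX006J)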

Abstract: Building on an analysis of the existing Tibetan character feature extraction methods (the image projection method and the directional line element method), this work applies fractal moment theory and the coarse grid method to implement a fractal-moment-based feature extraction method and an improved coarse-grid feature extraction method for Tibetan characters. Features extracted with fractal moments effectively reflect both the local and the global features of Tibetan characters and reduce the loss in recognition rate caused by shifts in pixel position. Features extracted with the improved coarse grid method likewise reduce the loss in recognition rate caused by shifts in pixel position and, to some extent, overcome the high misrecognition rate caused by the large number of Tibetan characters. Experimental comparison shows that, at the same recognition rate as the directional line element method, the fractal moment and improved coarse grid methods run faster and, to some extent, overcome the high misrecognition rate caused by the very large number of Tibetan characters.

The Tibetan character set consists of 30 letters, 4 vowel signs, 4 subscripts, 3 superscripts, 10 digits and a number of punctuation marks. Tibetan takes the syllable as its word-building unit. Syllables are separated by syllable delimiters, and each horizontal unit in the spelling of a syllable is called a Tibetan character. A syllable contains at least one Tibetan character and at most four. Tibetan character recognition takes the Tibetan character as its basic recognition unit, so the main difficulty in printed Tibetan character recognition is selecting and extracting features that represent a character well. At present there are two main feature extraction methods for printed Tibetan characters: the image projection approach and the directional line element approach. The image projection approach obtains the feature vector by projecting the character image pixels along a given direction (such as vertical, horizontal or diagonal). Its matching and classification algorithms are simple and easy to implement, and it is robust to interference, but it distinguishes similar characters poorly. The directional line element approach quantizes each edge pixel of the character into one of four directions (horizontal 0°, vertical 90°, oblique 45° and anti-oblique 135°) and takes the quantization result as the direction attribute of that point. This approach captures the structural information of characters and also has statistical properties, and its recognition performance is better than that of the image projection approach. However, its feature vector has a high dimension, so using it requires some compression algorithm, which makes the description more complicated and the matching process considerably more complex during recognition. Approaches of feature extraction for printed Tibetan characters both based on fractal moments and based on the i
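As a rough illustration of two of the feature types named in the abstract, the sketch below computes image projection features and plain coarse grid features for a small binary glyph. It is a generic textbook version under stated assumptions, not the authors' implementation: the abstract does not specify how the improved coarse grid method or the fractal moment method work, and the function names, the `glyph` array and the grid size are all hypothetical.

```python
from typing import List

# Assumption: a character image is a binary matrix, 1 = ink, 0 = background.
Image = List[List[int]]


def projection_features(img: Image) -> List[int]:
    """Concatenate horizontal (per-row) and vertical (per-column) ink counts."""
    rows = [sum(row) for row in img]
    cols = [sum(col) for col in zip(*img)]
    return rows + cols


def coarse_grid_features(img: Image, n: int) -> List[float]:
    """Split the image into n x n cells and return the ink density of each cell.

    This is the plain coarse grid idea only; the paper's improved variant
    is not described in the abstract and is not reproduced here.
    """
    h, w = len(img), len(img[0])
    feats: List[float] = []
    for gy in range(n):
        y0, y1 = gy * h // n, (gy + 1) * h // n
        for gx in range(n):
            x0, x1 = gx * w // n, (gx + 1) * w // n
            ink = sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = (y1 - y0) * (x1 - x0)
            feats.append(ink / area if area else 0.0)
    return feats


# Hypothetical 4x4 glyph used only to exercise the functions.
glyph: Image = [
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

print(projection_features(glyph))      # [4, 1, 1, 2, 1, 4, 2, 1]
print(coarse_grid_features(glyph, 2))  # [0.75, 0.5, 0.5, 0.25]
```

The projection vector grows with the image height and width, while the coarse grid vector has a fixed length of n × n regardless of image size, which is one reason grid-style features are convenient for matching.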

Keywords: Tibetan character recognition; fractal moments; feature extraction; coarse grid

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
