Authors: AYSADET·Abliz; HOJAHMAT·Ismayil; KAMIL·Muyidin; ASKAR·Hamdulla [1]
Affiliation: [1] Institute of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2018, No. 9, pp. 133-138 (6 pages)
Funding: National Natural Science Foundation of China (No. 61462080)
Abstract: To address word segmentation in images of offline handwritten Uyghur text lines, this paper proposes a clustering algorithm that fuses FCM with K-means. The algorithm sorts distances into two classes: within-word distances and between-word distances. Based on the clustering result, text regions are merged to obtain segmentation points; the text between segmentation points is then labeled by connected components and colored. The experiments use 50 offline handwritten Uyghur text images written by different people, containing 536 lines and 4,002 words in total, and the correct segmentation rate reaches 80.68%. The results show that the method resolves the segmentation difficulty caused by irregular distances between words in handwritten Uyghur, as well as some cases of overlap between adjacent words. It also allows large handwritten text images to be processed as a whole.
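Keywords: Uyghur; handwritten text image; word segmentation; clustering; coloring
Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]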
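
The abstract describes splitting gap distances into a within-word class and a between-word class by clustering. The record does not say how FCM (fuzzy C-means) and K-means are fused, so the following is only a minimal sketch under one assumed reading: 2-means seeds the two centers and FCM then refines them. The function names, the fuzzifier m = 2, the 0.5 membership cutoff, and the idea that the input is a list of pixel gaps between consecutive connected components are all illustrative assumptions, not the paper's method.

```python
def kmeans_two_centers(values, max_iter=50):
    """Plain 2-means on 1-D data, seeded at the smallest and largest value."""
    low, high = float(min(values)), float(max(values))
    for _ in range(max_iter):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not near_high:  # all values identical: nothing to split
            break
        new_low = sum(near_low) / len(near_low)
        new_high = sum(near_high) / len(near_high)
        if (new_low, new_high) == (low, high):
            break
        low, high = new_low, new_high
    return [low, high]


def memberships(value, centers, m):
    """Standard FCM membership of one value in each center."""
    dists = [abs(value - c) for c in centers]
    zeros = [k for k, d in enumerate(dists) if d == 0.0]
    if zeros:  # value sits exactly on one or more centers
        return [1.0 / len(zeros) if k in zeros else 0.0
                for k in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]


def fcm_refine(values, centers, m=2.0, max_iter=100, tol=1e-6):
    """Refine the given centers with fuzzy C-means updates."""
    centers = list(centers)
    for _ in range(max_iter):
        u = [memberships(v, centers, m) for v in values]
        new_centers = []
        for k, old in enumerate(centers):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            new_centers.append(
                sum(w * v for w, v in zip(weights, values)) / total
                if total else old
            )
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers


def word_breaks(gaps, m=2.0):
    """Indices of gaps assigned to the wider (assumed between-word) cluster."""
    centers = fcm_refine(gaps, kmeans_two_centers(gaps), m)
    wide = centers.index(max(centers))
    return [i for i, g in enumerate(gaps)
            if memberships(g, centers, m)[wide] > 0.5]
```

Calling `word_breaks` on a list of gap widths returns the positions where a word boundary would be placed under these assumptions; merging regions, connected-component labeling, and coloring are outside this sketch.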