Authors: SONG Zhiping, ZHU Yali [1], XU Xuebin, Wuernisha Maimaiti, Kuerban Wubuli [1,2]
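Affiliations: [1] School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830017, China; [2] Xinjiang Multilingual Information Technology Key Laboratory, Urumqi, Xinjiang 830017, China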
Source: Journal of Xinjiang University (Natural Science Edition in Chinese and English), 2022, No. 3, pp. 323-330 (8 pages)
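Funding: Key Program of the National Natural Science Foundation of China (61862061, 61563052, 61363064); Youth Fund Project of the Science and Technology Department of Xinjiang Uygur Autonomous Region (2021D01C119).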
Abstract: Uygur characters are strongly connected and have non-closed structures, which makes Uygur keyword image retrieval very difficult. To improve the efficiency of Uygur document image retrieval, a two-pass keyword image retrieval algorithm based on the gray histogram and improved Hu invariant moments is proposed. The algorithm retrieves word images in two passes: rough retrieval and secondary retrieval. In the rough retrieval stage, gray histogram features are extracted from the segmented word images and roughly matched against the word database; while the recall rate is preserved, some irrelevant word images are filtered out to form a candidate word library. Precise matching is then carried out on the results of rough matching, using improved Hu invariant moments to describe the contour features of the keyword images. The improved method unifies eccentricity, region moments, and structural moments within the Hu invariant moments and can effectively describe the contour information of an image. In experiments on a database of 115 plain-text Uygur document images, the average retrieval accuracy is 78.36% and the average recall is 81.68%.
Keywords: Uygur script; gray histogram; Hu invariant moments; rough matching; secondary retrieval
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
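The two-pass scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the first four standard Hu invariant moments rather than the paper's improved variant (whose formulas are not given here), and the function names, the 32-bin histogram, the histogram-intersection similarity, the ink threshold of 128, and the `coarse_thresh` value are all assumptions chosen for illustration.

```python
import numpy as np

def gray_histogram(img, bins=32):
    """Normalized gray-level histogram of a uint8 word image."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return float(np.minimum(h1, h2).sum())

def hu_moments(img, ink_thresh=128):
    """First four standard Hu moments of the ink mask.

    Assumes dark text on a light background; ink_thresh is hypothetical.
    """
    mask = (img < ink_thresh).astype(np.float64)
    m00 = mask.sum()
    if m00 == 0:
        return np.zeros(4)
    ys, xs = np.mgrid[:mask.shape[0], :mask.shape[1]]
    dx = xs - (xs * mask).sum() / m00  # offsets from the centroid
    dy = ys - (ys * mask).sum() / m00

    def eta(p, q):
        # Normalized central moment of order (p, q).
        return (dx ** p * dy ** q * mask).sum() / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
    ])

def log_hu(img):
    """Log-scaled Hu moments so that all components have comparable range."""
    return -np.log10(hu_moments(img) + 1e-12)

def two_stage_retrieve(query, database, coarse_thresh=0.8, top_k=5):
    """Return indices of database images ranked for the query image."""
    # Pass 1 (rough): keep images whose gray histogram resembles the query's.
    q_hist = gray_histogram(query)
    candidates = [
        i for i, word in enumerate(database)
        if histogram_intersection(q_hist, gray_histogram(word)) >= coarse_thresh
    ]
    # Pass 2 (fine): rank the candidates by L1 distance between Hu features.
    q_hu = log_hu(query)
    ranked = sorted(
        candidates,
        key=lambda i: float(np.abs(q_hu - log_hu(database[i])).sum()),
    )
    return ranked[:top_k]
```

Calling `two_stage_retrieve(query, database)` with a list of grayscale `uint8` arrays returns candidate indices, best match first. A looser `coarse_thresh` keeps more candidates, which favors recall at the cost of more work in the second pass.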