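Affiliations: [1] School of Physics and Electronic-Electrical Engineering, Ningxia University, Yinchuan, Ningxia Hui Autonomous Region 750021 [2] School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044
Source: 《电脑与信息技术》 (Computer and Information Technology), 2017, No. 6, pp. 28-32, 42 (6 pages)
Funding: National Natural Science Foundation of China (Grant No. 61162020)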
Abstract: With the development of computer technology and its application to text processing, research on digitizing Tangut script has gradually begun. Image segmentation and recognition of Tangut script are valuable for cultural relics research and for translating historical documents, but the key constraint on Tangut recognition is the establishment of a Tangut database. Based on the characteristics of Tangut characters, this paper designs a concrete procedure for character extraction and for building a sample database, and discusses how database retrieval is organized and performed. Characters are extracted using a connected-component labeling algorithm and the principle of edge detection, and the extracted character information is then stored in a designated text file. Finally, the extracted Tangut characters are matched with Chinese characters; once saved, this completes a sample database containing the Tangut characters, the Chinese characters, their corresponding serial numbers, and other information. The database provides a test benchmark for Tangut character recognition.
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
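The abstract names connected-component labeling as the basis for character extraction but gives no implementation details. The following is only a minimal sketch of that general technique, not the authors' method: it assumes the page image has already been binarized into a grid of 0 (background) and 1 (ink), uses 8-connectivity, and returns one bounding box per connected region. The function name `label_components` and the sample grid `page` are hypothetical. A real Tangut character usually consists of several disconnected strokes, so nearby boxes would still have to be merged into whole characters, a step not shown here.

```python
from collections import deque


def label_components(image):
    """Find 8-connected foreground regions (value 1) in a binary image.

    Returns a list of bounding boxes (top, left, bottom, right), one per
    region, in the order regions are first met in a row-major scan.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from the first unseen ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical 5x5 binarized patch containing three separate ink regions.
page = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
boxes = label_components(page)
```

Each bounding box could then be cropped from the image and written out with a serial number, which is the kind of record the abstract describes pairing with a Chinese character.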