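Affiliations: [1] Key Laboratory of Computational Linguistics, Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081; [2] Department of Automatic Control, Beijing Institute of Technology, Beijing 100081
Source: Chinese Journal of Computers (《计算机学报》), 2004, No. 4, pp. 524-529 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (60173024)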
Abstract: The Tibetan coded character sets defined in the Chinese national standard GB16959-1997 and in ISO/IEC 10646-1:1993 need to be applied across all kinds of software and databases, where sorting is an important technology. Sorting written Tibetan characters and words involves three notions: structure order, constitution class, and character order. It therefore differs in nature from the ordering of Chinese or English. A written Tibetan word has a highly complex, multi-level structure. The paper analyzes in detail the glyph shapes and structural forms of Tibetan characters, the traditional character order, and the length and vertical stack height of Tibetan characters, and from this analysis builds a mathematical model of Tibetan sorting. Following the model, each class of Tibetan symbol is assigned a numeric value. An algorithm then determines the position of each character step by step and identifies it by matching against character-value lists. Finally, the values extracted from the characters of each word are combined and compared to produce the sorted order. The model has been implemented on the Windows platform (Windows 2000/NT).
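Keywords: Tibetan characters; structure order; constitution class; character order; computer sorting; mathematical model
Classification: TP317.2 [Automation and Computer Technology: Computer Software and Theory]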
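The abstract's pipeline (assign a number to each symbol, identify the component in each structural position, then compare the combined values) can be illustrated with a minimal Python sketch. This is not the paper's model. The `Syllable` slots, the weight tables, and above all the slot priority used in `sort_key` are assumptions chosen for illustration, and components are written in Wylie transliteration rather than Tibetan script. Only the traditional order of the 30 consonants and of the vowels is taken as given.

```python
from typing import NamedTuple

# The 30 Tibetan consonants in traditional alphabet order (Wylie transliteration).
CONSONANTS = [
    "k", "kh", "g", "ng", "c", "ch", "j", "ny", "t", "th", "d", "n",
    "p", "ph", "b", "m", "ts", "tsh", "dz", "w", "zh", "z", "'", "y",
    "r", "l", "sh", "s", "h", "a",
]
# Inherent vowel "a" first, then the four written vowel signs.
VOWELS = ["a", "i", "u", "e", "o"]

# Numeric value per symbol; 0 is reserved for an empty structural slot.
CONSONANT_WEIGHT = {c: i + 1 for i, c in enumerate(CONSONANTS)}
VOWEL_WEIGHT = {v: i + 1 for i, v in enumerate(VOWELS)}


class Syllable(NamedTuple):
    """Hypothetical pre-parsed syllable: one field per structural position."""
    root: str
    vowel: str = "a"
    prefix: str = ""
    superscript: str = ""
    subscript: str = ""
    suffix: str = ""
    post_suffix: str = ""


def sort_key(s: Syllable) -> tuple:
    """Combine the per-slot values into one comparable tuple.

    The slot priority below (root first, then superscript, prefix, ...)
    is an assumption for this sketch, not the ordering defined in the paper.
    """
    def c(x: str) -> int:
        return CONSONANT_WEIGHT[x] if x else 0

    return (
        c(s.root), c(s.superscript), c(s.prefix), c(s.subscript),
        VOWEL_WEIGHT[s.vowel], c(s.suffix), c(s.post_suffix),
    )


def sort_syllables(syllables):
    """Sort syllables by comparing their combined numeric values."""
    return sorted(syllables, key=sort_key)
```

Because Python compares tuples element by element, the order of the slots in `sort_key` directly encodes which structural position dominates the comparison. Swapping in a different priority or different weight tables changes the collation without touching the sorting step.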