书面藏语排序的数学模型及算法  (Cited by: 25)

The Sorting Mathematical Model and Algorithm of Written Tibetan Language

Authors: 江荻 (Jiang Di) [1], 康才晙 (Kang Caijun)

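Affiliations: [1] Key Laboratory of Computational Linguistics, Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081; [2] Department of Automatic Control, Beijing Institute of Technology, Beijing 100081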

Source: Chinese Journal of Computers (《计算机学报》), 2004, No. 4, pp. 524-529 (6 pages)

Funding: Supported by the National Natural Science Foundation of China (60173024)

Abstract: The Chinese national standard GB16959-1997 and ISO/IEC 10646-1:1993 define coded character sets for Tibetan information processing. Applying these sets in software and databases creates an engineering need for sorting, which is an important supporting technology. Sorting written Tibetan words involves three concepts: structure order (结构序), construction class (构造级), and character order (字符序). It differs in nature from the ordering of Chinese or English text, because a written Tibetan word has a highly complex, multi-level structure. The paper analyzes in detail the glyph shapes and structural forms of Tibetan words, the order of the construction categories, the traditional sequence of characters in each structural position, and features such as word length and the height of the vertical stacks, and on that basis builds a mathematical model of Tibetan sorting. Following the model, each class of Tibetan symbol is assigned a distinct numeric value. An algorithm then determines the position of each character in a word step by step and identifies the character by matching it against character-value lists. Finally, the values extracted from the characters of a word are combined, and the combinations for different words are compared to produce the sorted order of Tibetan words. The model has been implemented on the Windows 2000/NT platform.
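The value-assignment and comparison steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the slot names, the slot priority in `SLOTS`, and the numeric values are all assumptions made for illustration, the words are given already decomposed into structural positions (the paper's position-identification step is skipped), and Wylie transliteration is used instead of Tibetan code points.

```python
from typing import Dict, List, Tuple

# The 30 Tibetan consonants in traditional alphabet order (Wylie transliteration).
CONSONANTS = [
    "k", "kh", "g", "ng", "c", "ch", "j", "ny", "t", "th",
    "d", "n", "p", "ph", "b", "m", "ts", "tsh", "dz", "w",
    "zh", "z", "'", "y", "r", "l", "sh", "s", "h", "a",
]

# Assumed numeric values: 1..30 by alphabet order; 0 means "slot is empty".
CONSONANT_VALUE = {c: i + 1 for i, c in enumerate(CONSONANTS)}
# The inherent vowel "a" is unwritten, so it is modelled as an empty slot.
VOWEL_VALUE = {"i": 1, "u": 2, "e": 3, "o": 4}

# Hypothetical structural positions, listed in an assumed comparison priority.
SLOTS = ("root", "superscript", "prefix", "subscript", "vowel", "suffix")


def sort_key(word: Dict[str, str]) -> Tuple[int, ...]:
    """Combine the per-slot values of one decomposed word into a sort key."""
    key = []
    for slot in SLOTS:
        ch = word.get(slot, "")
        if not ch:
            key.append(0)  # an empty slot sorts before any filled slot
        elif slot == "vowel":
            key.append(VOWEL_VALUE[ch])
        else:
            key.append(CONSONANT_VALUE[ch])
    return tuple(key)


def sort_words(words: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the word labels ordered by their combined value tuples."""
    return sorted(words, key=lambda name: sort_key(words[name]))


# Illustrative input: label -> pre-decomposed structure (hypothetical data).
EXAMPLE_WORDS = {
    "ga": {"root": "g"},
    "rka": {"root": "k", "superscript": "r"},
    "dka'": {"root": "k", "prefix": "d", "suffix": "'"},
    "ki": {"root": "k", "vowel": "i"},
    "kag": {"root": "k", "suffix": "g"},
    "ka": {"root": "k"},
}

if __name__ == "__main__":
    print(sort_words(EXAMPLE_WORDS))
```

Because Python compares tuples element by element, the first differing slot decides the order, so the priority encoded in `SLOTS` plays the role that structure order and construction class play in the abstract. A real implementation would replace that assumed priority and the value tables with the ones defined in the paper.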

Keywords: Tibetan characters (藏字); structure order; construction class; character order; computer sorting; mathematical model

Classification: TP317.2 [Automation and Computer Technology: Computer Software and Theory]

 
