Simplified and Traditional Chinese Character Conversion Based on Word-List Rules and Sentence-Context Disambiguation


Author: Huang Hao (黄皓) (Zhongshan Open University, Zhongshan, Guangdong 528403, China)

Affiliation: [1] Zhongshan Open University, Zhongshan, Guangdong 528403

Source: Computer Era (《计算机时代》), 2021, No. 9, pp. 22-25 (4 pages)

Abstract: Two problems are particularly difficult in converting between simplified and traditional Chinese characters: disambiguating one-to-many characters, and avoiding over-conversion of divergent words. A one-to-many word list, a common word list, and a divergent word list are constructed, and restrictive conversion rules are added to these lists. The validity of a candidate word can be judged by whether its first or last character can form a different word with the adjacent character. The rules in the word lists are matched against the context of the current sentence, and attributes such as noun, verb, measure word, surname, and word frequency are analyzed together, so that disambiguation and conversion are performed intelligently. A simplified-traditional conversion system was implemented on this basis, and practice shows that the approach is effective.
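The boundary check described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the toy lexicon, the function name `boundary_conflicts`, and the restriction to two-character neighbouring words are all assumptions made for illustration. The function reports which other words the first or last character of a candidate could form with its neighbour, which is the signal the abstract says is used to judge a candidate's validity.

```python
# Hypothetical toy lexicon; the paper's actual word lists are not reproduced here.
LEXICON = {"头发", "发现", "现在"}


def boundary_conflicts(sentence, start, end, lexicon=LEXICON):
    """Return lexicon words that straddle the boundary of a candidate word.

    The candidate occupies sentence[start:end]. Only two-character
    straddling words are considered (an assumption of this sketch).
    """
    conflicts = []
    # First character of the candidate combined with the preceding character.
    if start > 0:
        left = sentence[start - 1:start + 1]
        if left in lexicon:
            conflicts.append(left)
    # Last character of the candidate combined with the following character.
    if end < len(sentence):
        right = sentence[end - 1:end + 1]
        if right in lexicon:
            conflicts.append(right)
    return conflicts


sentence = "他头发现在很长"
start = sentence.index("发现")
# The candidate "发现" overlaps both "头发" and "现在", so it is suspect.
print(boundary_conflicts(sentence, start, start + 2))
```

A non-empty result only flags the candidate as doubtful. Choosing between the overlapping words (here, whether 发 should become 髮 as in 頭髮 or 發 as in 發現) would still depend on the additional rules and attributes the abstract mentions, such as part of speech and word frequency.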

Keywords: simplified characters; traditional characters; word segmentation; word correspondence table; context

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
