Author: Huang Hao (黄皓), Zhongshan Open University, Zhongshan, Guangdong 528403, China
Affiliation: [1] Zhongshan Open University (中山开放大学), Zhongshan, Guangdong 528403
Source: Computer Era (《计算机时代》), 2021, No. 9, pp. 22-25 (4 pages)
Abstract: Two problems dominate conversion between simplified and traditional Chinese characters: disambiguating characters with one-to-many mappings, and avoiding over-conversion of divergent words. The paper constructs three word lists (a one-to-many list, a common word list, and a divergent word list) and attaches restrictive conversion rules to their entries. The validity of a candidate word is judged by whether its first or last character can instead combine with the adjacent character to form a different word. The rules in the word lists are matched against the context of the current sentence, and attributes such as noun, verb, measure word, surname, and word frequency are analyzed together to make disambiguation and conversion intelligent. A simplified/traditional conversion system was implemented on this basis, and practical use showed the approach to be effective.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
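
The boundary test described in the abstract (a candidate word is suspect when its first or last character can form another word with the neighbouring character) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the tiny tables, the names `DIVERGENT_WORDS`, `COMMON_WORDS`, `CHAR_MAP`, `candidate_is_valid`, and `convert`, the choice of 内存/記憶體 as a divergent term, and the restriction to two-character neighbouring words. The paper's actual word lists, rule format, and use of part-of-speech and frequency attributes are not given in the abstract and are not modelled.

```python
# Hypothetical miniature tables; the paper's real word lists are not in the abstract.
DIVERGENT_WORDS = {"内存": "記憶體"}      # simplified term -> divergent traditional term
COMMON_WORDS = {"国内", "存在", "问题"}    # words that neighbouring characters may form
CHAR_MAP = {"国": "國", "内": "內", "问": "問", "题": "題"}  # one-to-one fallback


def candidate_is_valid(text: str, start: int, end: int) -> bool:
    """Boundary test for the candidate word text[start:end].

    The candidate is rejected when its first character forms a common word
    with the preceding character, or its last character forms a common word
    with the following character (two-character words only, by assumption).
    """
    if start > 0 and text[start - 1:start + 1] in COMMON_WORDS:
        return False
    if end < len(text) and text[end - 1:end + 1] in COMMON_WORDS:
        return False
    return True


def convert(text: str) -> str:
    """Convert simplified text, applying divergent words only when valid."""
    out = []
    i = 0
    while i < len(text):
        for word, target in DIVERGENT_WORDS.items():
            if text.startswith(word, i) and candidate_is_valid(text, i, i + len(word)):
                out.append(target)
                i += len(word)
                break
        else:
            # No valid divergent word starts here: convert one character.
            out.append(CHAR_MAP.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(convert("内存不足"))      # 内存 is a real word here, so it is replaced
print(convert("国内存在问题"))  # 内存 spans 国内 + 存在, so it is left alone
```

In the second call the candidate 内存 is rejected because 国内 and 存在 are both common words, so the text falls back to character-by-character conversion instead of being over-converted.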