Affiliation: [1] School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009
Source: 《微计算机信息》 (Control & Automation), 2010, Issue 31, pp. 87-89 (3 pages)
Abstract: Identifying unknown (out-of-vocabulary) words is a difficult problem in Chinese information processing, and progress on it matters for improving the accuracy of both automatic Chinese word segmentation and syntactic analysis. Chinese personal names alone account for as much as 15% of unknown words, which shows how important name recognition is to unknown-word identification and to automatic segmentation as a whole. This paper designs a Chinese personal name recognition method based on ontology and rule matching. An ontology-based hierarchical classification of Chinese personal names is built first and used during word segmentation to guide the extraction of candidate names from the source text. The candidates are then corrected by matching against a rule base, while the recognition results are analyzed to generate new rules that are fed back into the rule base. The method organizes the Chinese name knowledge base effectively, has some self-learning ability, and can achieve relatively good name recognition results.
Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]
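
The pipeline summarized in the abstract (ontology-guided candidate extraction, rule-base correction, and feedback of new rules) can be illustrated with a minimal sketch. This is not the paper's implementation, which the record does not give. Everything here is an assumption made for illustration: the toy `NAME_ONTOLOGY` with two surname categories, the `RuleBase` class and its cue words, and the choice to model "new rules" as confirmed full names that are later accepted without contextual evidence.

```python
from dataclasses import dataclass, field

# Hypothetical, tiny stand-in for the ontology's name hierarchy:
# category -> surface forms. A real knowledge base would be far larger.
NAME_ONTOLOGY = {
    "compound_surname": {"欧阳", "司马"},
    "single_surname": {"王", "李", "张", "范"},
}


@dataclass
class RuleBase:
    """Illustrative rule base; the cue words below are assumptions."""
    # Words that suggest a person name when they directly follow a candidate.
    right_cues: set = field(default_factory=lambda: {"教授", "说"})
    # Characters assumed to be unlikely inside a given name.
    bad_chars: set = field(default_factory=lambda: {"的", "是", "在"})
    # Names confirmed earlier; this is the "learned" part of the rule base.
    known_names: set = field(default_factory=set)

    def accepts(self, candidate, given, right_context):
        if any(ch in self.bad_chars for ch in given):
            return False
        if candidate in self.known_names:
            return True
        return any(right_context.startswith(cue) for cue in self.right_cues)

    def learn(self, candidate):
        # Feedback step: a confirmed name becomes a new acceptance rule.
        self.known_names.add(candidate)


def find_surname(text, i):
    """Return the surname starting at position i, preferring compound ones."""
    for category in ("compound_surname", "single_surname"):
        for surname in NAME_ONTOLOGY[category]:
            if text.startswith(surname, i):
                return surname
    return None


def recognize_names(text, rules):
    """Scan text, propose surname-led candidates, and filter them by rules."""
    names, i = [], 0
    while i < len(text):
        surname = find_surname(text, i)
        matched = None
        if surname:
            start = i + len(surname)
            for given_len in (2, 1):  # try two-character given names first
                end = start + given_len
                if end > len(text):
                    continue
                candidate = text[i:end]
                if rules.accepts(candidate, text[start:end], text[end:]):
                    matched = candidate
                    break
        if matched:
            rules.learn(matched)
            names.append(matched)
            i += len(matched)
        else:
            i += 1
    return names


if __name__ == "__main__":
    rules = RuleBase()
    print(recognize_names("范并思说图书馆学很重要。后来范并思的论文发表了。", rules))
```

In this sketch the second mention of the name is followed by no cue word, so it is only accepted because the first mention was confirmed and fed back into the rule base. A fresh `RuleBase` given only the second sentence would reject it, which is the self-learning behavior the abstract describes in general terms.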