Research on a Chinese Automatic Word Segmentation System Based on N-Gram Grammar (cited by: 2)

The Research of Chinese Automatic Word Segmentation System Based on N-Gram Statistical Model

Source: Microelectronics & Computer (《微电子学与计算机》), 2009, Issue 7, pp. 98-101 (4 pages)

Abstract: This paper presents a Chinese automatic word segmentation system based on an N-gram statistical model. The method combines segmentation with part-of-speech (POS) tagging and uses the tagging to help evaluate the segmentation results. First, the system generates the top N segmentation results as a candidate set, using a dictionary together with a unigram statistical model. Next, it tags each candidate in the set with a POS tagger based on a bigram statistical model. Finally, it uses contextual "understanding" information from the text to select the best segmentation. Experiments show that feedback from POS tagging effectively improves segmentation accuracy.
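The three stages in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy dictionary, the probabilities in `UNIGRAM`, `EMIT`, and `TRANS`, the function names, and the choice to combine the two stages by summing their log scores. The paper's final step, which draws on contextual information, is not described in enough detail in the abstract to reproduce, so the sketch stops at rescoring the candidates with the tagging score.

```python
import math

# Toy, hypothetical model parameters (not taken from the paper).
UNIGRAM = {"研究": 0.02, "研究生": 0.08, "生命": 0.02, "命": 0.03, "起源": 0.03}
EMIT = {  # EMIT[word][tag] stands in for P(word | tag)
    "研究": {"v": 0.02, "n": 0.01},
    "研究生": {"n": 0.02},
    "生命": {"n": 0.02},
    "命": {"n": 0.02},
    "起源": {"n": 0.02},
}
TRANS = {  # TRANS[(prev_tag, tag)] stands in for P(tag | prev_tag); "<s>" is sentence start
    ("<s>", "n"): 0.4, ("<s>", "v"): 0.4,
    ("v", "n"): 0.6, ("n", "n"): 0.05, ("n", "v"): 0.3, ("v", "v"): 0.05,
}
MAX_WORD_LEN = max(len(w) for w in UNIGRAM)


def nbest_segment(text, n):
    """Return up to n (log_prob, words) segmentations, best unigram score first."""
    best = [[] for _ in range(len(text) + 1)]  # best[i]: top-n segmentations of text[:i]
    best[0] = [(0.0, [])]
    for end in range(1, len(text) + 1):
        cands = []
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            if word in UNIGRAM:  # only dictionary words are allowed in this sketch
                for score, words in best[start]:
                    cands.append((score + math.log(UNIGRAM[word]), words + [word]))
        best[end] = sorted(cands, key=lambda c: c[0], reverse=True)[:n]
    return best[len(text)]


def viterbi_tag(words):
    """Return (log_prob, tags) for the best tag sequence under the bigram tag model."""
    paths = {"<s>": (0.0, [])}  # last tag -> (score, tag sequence so far)
    for word in words:
        new_paths = {}
        for tag, p_emit in EMIT[word].items():
            options = [
                (score + math.log(TRANS[(prev, tag)]) + math.log(p_emit), tags + [tag])
                for prev, (score, tags) in paths.items()
                if (prev, tag) in TRANS
            ]
            if options:
                new_paths[tag] = max(options, key=lambda o: o[0])
        paths = new_paths
    return max(paths.values(), key=lambda p: p[0], default=(float("-inf"), []))


def segment(text, n=5):
    """Rescore the n-best candidates with the tagging score and return (words, tags)."""
    scored = []
    for seg_score, words in nbest_segment(text, n):
        tag_score, tags = viterbi_tag(words)
        scored.append((seg_score + tag_score, words, tags))
    if not scored:
        return [], []
    _, words, tags = max(scored, key=lambda s: s[0])
    return words, tags


print(nbest_segment("研究生命起源", 2)[0][1])  # unigram-only best: ['研究生', '命', '起源']
print(segment("研究生命起源"))  # after tagging feedback: (['研究', '生命', '起源'], ['v', 'n', 'n'])
```

With these toy numbers the unigram model alone prefers the wrong split, and the bigram tag score overturns it because the wrong candidate needs two unlikely noun-to-noun transitions. That is the kind of feedback effect the abstract reports, though the real system's parameters and scoring may differ. The sketch has no handling for out-of-vocabulary characters, so text that the dictionary cannot cover yields an empty result.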

Keywords: unigram grammar; bigram grammar; Chinese word segmentation; part-of-speech tagging

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
