A Novel POS Tagging Model (一种新颖的词性标注模型)  Cited by: 4

Authors: Yuan Lichi (袁里驰)[1], Zhong Yixin (钟义信)[1]

Affiliation: [1] School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing 100876

Source: Microelectronics & Computer (《微电子学与计算机》), 2005, No. 9, pp. 1-2, 6 (3 pages)

Funding: National Natural Science Foundation of China (69982001); National "863 Program" project (2001AA114201)

Abstract: This paper introduces, for the first time, a statistical model called the Markov Family Model (马氏族模型). The model assumes that the probability of a word depends both on its own part-of-speech tag and on the preceding word, but that the preceding word and the tag are conditionally independent given the word. With suitable simplification, the Markov Family Model can be applied successfully to part-of-speech tagging. Experimental results show that, under the same test conditions, the tagging method based on the Markov Family Model achieves substantially higher tagging accuracy than the conventional method based on the Hidden Markov Model. The Markov Family Model also has broad application prospects in other areas of natural language processing, such as word segmentation, syntactic parsing, speech recognition, machine translation, text-to-speech, and optical character recognition.
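
The abstract and keywords point to Viterbi decoding in which the score of a word depends on its tag and on the preceding word. The following is a minimal sketch of that idea only, not the paper's actual formulas: the function names `viterbi` and `toy_log_emit`, the two-tag tag set, and all numeric scores are hypothetical and hand-written for illustration. The transitions are deliberately uniform so that the effect of the preceding word on the emission score is visible in isolation.

```python
import math


def viterbi(words, tags, log_init, log_trans, log_emit):
    """Return the highest-scoring tag sequence for `words`.

    log_init[t]      : log score of tag t at the first position
    log_trans[s][t]  : log score of moving from tag s to tag t
    log_emit(w, t, p): log score of word w given tag t and preceding
                       word p (p is None at the first position)
    """
    if not words:
        return []
    # best[i][t] = best log score of any tag sequence ending in t at i
    best = [{t: log_init[t] + log_emit(words[0], t, None) for t in tags}]
    back = []  # back[i - 1][t] = best predecessor tag of t at position i
    for i in range(1, len(words)):
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda s: best[-1][s] + log_trans[s][t])
            scores[t] = (best[-1][prev] + log_trans[prev][t]
                         + log_emit(words[i], t, words[i - 1]))
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical toy scores (assumptions, not estimated from any corpus).
TAGS = ["N", "V"]
LOG_INIT = {t: math.log(0.5) for t in TAGS}
LOG_TRANS = {s: {t: math.log(0.5) for t in TAGS} for s in TAGS}
BASE_SCORES = {("time", "N"): 0.9, ("time", "V"): 0.1,
               ("fruit", "N"): 0.9, ("fruit", "V"): 0.1,
               ("flies", "N"): 0.5, ("flies", "V"): 0.5}
# Scores keyed on (word, tag, preceding word) override the base scores.
CONTEXT_SCORES = {("flies", "N", "time"): 0.1, ("flies", "V", "time"): 0.9,
                  ("flies", "N", "fruit"): 0.8, ("flies", "V", "fruit"): 0.2}


def toy_log_emit(word, tag, prev_word):
    """Use the context-specific score if present, else back off."""
    score = CONTEXT_SCORES.get((word, tag, prev_word))
    if score is None:
        score = BASE_SCORES.get((word, tag), 1e-6)  # floor for unseen pairs
    return math.log(score)


if __name__ == "__main__":
    print(viterbi(["time", "flies"], TAGS, LOG_INIT, LOG_TRANS, toy_log_emit))
    print(viterbi(["fruit", "flies"], TAGS, LOG_INIT, LOG_TRANS, toy_log_emit))
```

With these toy scores, "flies" is tagged `V` after "time" and `N` after "fruit", even though the tag transitions are identical in both cases. A plain Hidden Markov Model emission, which sees only the tag, could not make that distinction from the emission term alone.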

Keywords: Markov Family Model; part-of-speech tagging; Hidden Markov Model; Viterbi algorithm

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
