Source: Microelectronics & Computer (《微电子学与计算机》), 2005, No. 9, pp. 1-2, 6 (3 pages)
Funding: National Natural Science Foundation of China (69982001); National "863 Program" of China (2001AA114201)
Abstract: This paper introduces, for the first time, a statistical model called the Markov Family Model. The model assumes that the probability of a word depends both on its own part-of-speech tag and on the preceding word, but that the preceding word and the word's tag are conditionally independent given the word itself. A suitably simplified form of the model is applied successfully to part-of-speech tagging. Experimental results show that, under the same test conditions, the tagging method based on the Markov Family Model achieves substantially higher tagging accuracy than the conventional method based on the Hidden Markov Model. The Markov Family Model also has broad application prospects in other areas of natural language processing, such as word segmentation, parsing, speech recognition, and machine translation; the paper's English abstract additionally names text-to-speech and optical character recognition.
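The independence assumption stated in the abstract can be written out explicitly. The following is only a sketch: the notation (w_i for the i-th word, t_i for its part-of-speech tag) is assumed here, and the abstract does not say how the paper itself simplifies the model, so this should not be read as the authors' exact formulation.

```latex
% Assumed notation: w_i is the i-th word, t_i its part-of-speech tag.
\begin{align*}
% Stated assumption: given w_i, the previous word and the tag are independent.
P(w_{i-1}, t_i \mid w_i) &= P(w_{i-1} \mid w_i)\, P(t_i \mid w_i) \\[4pt]
% Consequence, by the definition of conditional probability and Bayes' rule:
P(w_i \mid w_{i-1}, t_i)
  &= \frac{P(w_i)\, P(w_{i-1} \mid w_i)\, P(t_i \mid w_i)}{P(w_{i-1}, t_i)} \\
  &= \frac{P(w_i \mid w_{i-1})\, P(w_i \mid t_i)}{P(w_i)}
     \cdot \frac{P(w_{i-1})\, P(t_i)}{P(w_{i-1}, t_i)}
\end{align*}
```

For comparison, a standard Hidden Markov Model tagger uses P(w_i | t_i) alone as its emission term. Under the stated assumption, the word-bigram term P(w_i | w_{i-1}) enters alongside it.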
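Keywords: Markov Family Model; part-of-speech tagging; Hidden Markov Model; Viterbi algorithm
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]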