A Part-of-Speech Tagging Method for English Essays (面向英语文章的词性标注算法)  Cited by: 3

Authors: Tan Yongmei (谭咏梅) [1], Wu Kun (吴坤) [1]

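Affiliation: [1] Center for Intelligence Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China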

Source: Journal of Beijing University of Posts and Telecommunications (《北京邮电大学学报》), 2014, No. 6, pp. 120-124 (5 pages)

Funding: National Natural Science Foundation of China (61273365)

Abstract: Part-of-speech tagging of English essays is the basis for automated essay scoring and correction. Although English part-of-speech tagging has been studied extensively, most of that work targets text written by native speakers, and little of it addresses essays written by learners of English as a second language. To address this, a corpus of essays written by Chinese learners of English was manually annotated, and on that basis a part-of-speech tagging algorithm for English essays is proposed. The algorithm combines several features, including unsupervised word clusters, a tag dictionary built from statistics over unlabeled corpora, and phonetic normalization based on word pronunciation. Experimental results show that the algorithm improves tagging performance on this corpus, raising tagging accuracy from 94.49% to 97.07%.
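
The abstract names three feature families (word clusters, an unlabeled-corpus tag dictionary, phonetic normalization) but does not say how they are encoded. The following is a minimal sketch of how such features might be assembled for one token before being passed to a sequence tagger. It is not the authors' implementation: the lookup tables, the `phonetic_key` rule, and all function and feature names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lookup tables. In a real system these would be induced from
# large unlabeled corpora; the entries here are invented for illustration.
WORD_CLUSTERS = {"the": "0010", "a": "0010", "run": "1101", "walk": "1101"}
TAG_DICTIONARY = {"the": {"DT"}, "run": {"VB", "NN"}, "because": {"IN"}}


def phonetic_key(word):
    """Crude phonetic normalization (an assumed stand-in, not the paper's rule)."""
    w = re.sub(r"[^a-z]", "", word.lower())
    w = w.replace("ph", "f").replace("ck", "k")
    w = re.sub(r"(.)\1+", r"\1", w)  # collapse runs of a repeated letter
    if len(w) > 1:
        # Keep the first letter, drop all later vowels.
        w = w[0] + re.sub(r"[aeiou]", "", w[1:])
    return w


# Map each phonetic key back to a known dictionary word, so that a learner
# misspelling can borrow the features of the word it sounds like.
PHONETIC_INDEX = {phonetic_key(w): w for w in TAG_DICTIONARY}


def token_features(tokens, i):
    """Build a feature dict for tokens[i] from context and the three resources."""
    word = tokens[i].lower()
    key = phonetic_key(word)
    canonical = PHONETIC_INDEX.get(key, word)
    cluster = WORD_CLUSTERS.get(canonical, "UNK")
    feats = {
        "word": word,
        "suffix3": word[-3:],
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
        "cluster_full": cluster,
        "cluster_prefix2": cluster[:2],
        "phonetic_key": key,
    }
    # One indicator feature per tag the dictionary allows for this word.
    for tag in sorted(TAG_DICTIONARY.get(canonical, ())):
        feats["dict_tag=" + tag] = True
    return feats


if __name__ == "__main__":
    sentence = ["I", "run", "becuase", "the"]
    print(token_features(sentence, 2))
```

In this sketch the misspelling "becuase" shares a phonetic key with "because", so it picks up the `dict_tag=IN` feature even though the misspelled form never appears in the tag dictionary. This is one plausible way such features could help on learner text.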

Keywords: part-of-speech tagging; student English essays; features; word clustering

Classification number: TN911.22 [Electronics and Telecommunications: Communication and Information Systems]

 
