MT-Oriented English PoS Tagging and Its Application to Noun Phrase Chunking  

Authors: Ma Jianjun, Huang Degen, Liu Haixia, Sheng Wenfeng

Affiliations: [1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116023, Peoples R China; [2] Dalian Univ Technol, Sch Foreign Languages, Dalian 116023, Peoples R China

Source: China Communications (English edition), 2012, Issue 3, pp. 58-67 (10 pages)

Funding: Supported by the National Natural Science Foundation of China under Grant No. 61173100 and the Fundamental Research Funds for the Central Universities under Grant No. GDUT10RW202.

Abstract: A hybrid approach to English Part-of-Speech (PoS) tagging is presented, with English-Chinese machine translation in the business domain as its target application. The paper demonstrates how an existing tagger can be adapted to learn from a small amount of data and to handle unknown words for the purpose of machine translation. A small annotated English corpus of 998 k in the business domain is built semi-automatically based on a new tagset; a maximum entropy model is adopted, and a rule-based approach is used in post-processing. The tagger is further applied to Noun Phrase (NP) chunking. Experiments show that the tagger achieves an accuracy of 98.14%, which the authors consider quite satisfactory. In the application to NP chunking, the tagger yields a 2.21% increase in F-score compared with the results obtained using the Stanford tagger.

Keywords: English PoS tagging; maximum entropy; rule-based approach; machine translation; NP chunking

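Classification: TP391 [Automation and Computer Technology: Computer Application Technology]; O151.21 [Automation and Computer Technology: Computer Science and Technology]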

 
