Authors: Ma Jianjun, Huang Degen, Liu Haixia, Sheng Wenfeng
Affiliations: [1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116023, Peoples R China; [2] Dalian Univ Technol, Sch Foreign Languages, Dalian 116023, Peoples R China
Source: China Communications (中国通信, English edition), 2012, No. 3, pp. 58-67 (10 pages)
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61173100 and the Fundamental Research Funds for the Central Universities under Grant No. GDUT10RW202
Abstract: A hybrid approach to English Part-of-Speech (PoS) tagging, with English-Chinese machine translation in the business domain as its target application, is presented, demonstrating how a present tagger can be adapted to learn from a small amount of data and handle unknown words for the purpose of machine translation. A small 998 k English annotated corpus in the business domain is built semi-automatically based on a new tagset; the maximum entropy model is adopted, and a rule-based approach is used in post-processing. The tagger is further applied in Noun Phrase (NP) chunking. Experiments show that our tagger achieves an accuracy of 98.14%, which is a quite satisfactory result. In the application to NP chunking, the tagger gives rise to a 2.21% increase in F-score, compared with the results using the Stanford tagger.
Keywords: English PoS tagging; maximum entropy; rule-based approach; machine translation; NP chunking