Authors: Hong Geun Ji, Soyoung Oh, Jina Kim, Seong Choi, Eunil Park
Affiliations: [1] Department of Applied Artificial Intelligence, Sungkyunkwan University, Seoul, 03063, Korea; [2] Raon Data, Seoul, 03073, Korea; [3] Department of Computer Science and Engineering, University of Minnesota, Minneapolis, 55455, MN, USA
Source: Computers, Materials & Continua (English-language journal), 2022, Issue 1, pp. 669-678 (10 pages)
Funding: This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-00358, AI·Big data based Cyber Security Orchestration and Automated Response Technology Development).
Abstract: In the field of natural language processing (NLP), the advancement of neural machine translation has paved the way for cross-lingual research. Yet most studies in NLP have evaluated the proposed language models on well-refined datasets. We investigate whether a machine translation approach is suitable for multilingual analysis of unrefined datasets, particularly chat messages on Twitch. To address this question, we collected a dataset that included 7,066,854 and 3,365,569 chat messages from English and Korean streams, respectively. We employed several machine learning classifiers and neural networks with two different types of embedding: word-sequence embedding and the final layer of a pre-trained language model. The results of the employed models indicate that the accuracy difference between English and English-to-Korean data was relatively high, ranging from 3% to 12%. For Korean data (Korean and Korean-to-English), it ranged from 0% to 2%. The results therefore imply that translation from a low-resource language (e.g., Korean) into a high-resource language (e.g., English) yields higher performance than translation in the opposite direction. Several implications and limitations of the presented results are also discussed. For instance, we suggest that translating from resource-poor languages is a feasible way to use the tools of resource-rich languages in further analysis.
Keywords: Twitch; multilingual; machine translation; machine learning
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]