An Investigation into the Reliability of Automatic Dependency Parsing Tools in Chinese-English Cross-Language Studies  Cited by: 4

Original title: 面向英、汉跨语言研究的自动依存句法分析工具信度研究

Authors: Liu Ding-jia (刘鼎甲); Zhang Zi-yan (张子嬿) (National Research Centre for Foreign Language Education/National Research Centre for State Language Capacity, Beijing Foreign Studies University, Beijing 100089, China)

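Affiliation: [1] National Research Centre for Foreign Language Education/National Research Centre for State Language Capacity, Beijing Foreign Studies University, Beijing 100089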

Source: Foreign Language Research (《外语学刊》), 2021, No. 6, pp. 9-16 (8 pages)

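Funding: This paper is an interim result of the National Social Science Fund of China project "Construction of a Joint Platform for the Processing, Retrieval and Data Analysis of Multilingual-Chinese Parallel Corpora" (20BYY100); the Fundamental Research Funds for the Central Universities, Beijing Foreign Studies University "Double First-Class" research project "Modeling and Determination of Quantitative Syntactic Features Based on English-Chinese Bilingual Treebanks"; and the Beijing Foreign Studies University research start-up fund for newly appointed faculty, "Construction of a Platform for the Processing, Retrieval and Data Analysis of Multilingual-Chinese Parallel Corpora".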

Abstract: In recent years, syntactic analysis has been widely adopted in language studies, and as the volume of corpus data multiplies, automatic analysis methods and tools have become increasingly important. However, the applicability and accuracy of automatic parsing methods and tools that were originally developed for natural language processing research remain poorly understood. In particular, their applicability to cross-language and cross-genre research, and the significance of the features they yield, have not been tested, which makes researchers hesitant to use them on large amounts of data. The reliability of automatic syntactic parsing in empirical language research is therefore the crux of the issue. This paper first examines and compares the accuracy of three mainstream parsers, Stanford Parser, Mate Parser and Malt Parser, in the automatic syntactic parsing of English and Chinese. On that basis, taking science and technology, news, social science and literary texts as examples, it then investigates, within the framework of dependency grammar, the differences among source English, translated Chinese and original Chinese. The results demonstrate the significance of the syntactic features and thereby support the applicability of dependency syntax analysis methods in cross-language and cross-genre language studies.

Keywords: syntactic analysis; reliability; corpus; cross-language; empirical research

Classification code: H030 [Language and Writing: Linguistics]

 
