语料库研究的常用方法 (Common Methods of Corpus Research)  Cited by: 1

Methodology of Corpus Research


Authors: Sun Ruohong (孙若红)[1], Liu Yan (刘岩)[2]

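Affiliations: [1] School of Foreign Languages, Shenyang Normal University, Shenyang, Liaoning 110034; [2] Department of Public Foreign Language Teaching, Shenyang Institute of Engineering, Shenyang, Liaoning 110136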

Source: Journal of Shenyang Normal University (Social Science Edition) (《沈阳师范大学学报(社会科学版)》), 2016, No. 2, pp. 72-75 (4 pages)

Funding: Humanities and Social Sciences Research Planning Fund of the Ministry of Education (11YJA740078)

Abstract: Quantification in corpus linguistics is not the simple counting of linguistic features. Rather, it is the precise mathematical analysis of complex data: finding regularities in messy data in order to reveal, as accurately as possible, the differences that genuinely exist between texts of different genres or even between different languages. The basic methods used in corpus research are concordancing and common statistical measures such as frequency normalization, the chi-square test, the Z-score, the T-score, and the MI-score. Concordance lines provide a variety of information about language use, such as "centrality", "typicality", and the sense differences between synonyms. The MI-score, Z-score, and T-score are all commonly used to measure collocation strength, but each has its own advantages and disadvantages: the MI-score and Z-score are biased towards low-frequency words, while the T-score is biased towards high-frequency words. In practice, therefore, the choice of statistical method should reflect the needs of the research; an alternative is to use several statistical methods in combination.
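The abstract names these measures without giving formulas. As a rough illustration, the Python sketch below uses commonly cited textbook definitions; these formulas, the function names, the `span` parameter, and the example counts are all assumptions made here and may differ from the definitions used in the paper itself.

```python
import math


def normalized_frequency(raw_count, corpus_size, base=1_000_000):
    """Frequency per `base` tokens, so corpora of different sizes can be compared."""
    return raw_count * base / corpus_size


def chi_square_2x2(count_a, size_a, count_b, size_b):
    """Pearson chi-square (no continuity correction) for one word in two corpora.

    The 2x2 table is: occurrences of the word vs. all other tokens, per corpus.
    """
    a, b = count_a, count_b
    c, d = size_a - count_a, size_b - count_b
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def expected_cooccurrence(node_freq, collocate_freq, corpus_size, span=1):
    """Co-occurrences expected by chance within a window of `span` tokens."""
    return node_freq * collocate_freq * span / corpus_size


def mi_score(observed, expected):
    """Mutual information: log2 of observed over expected co-occurrence."""
    return math.log2(observed / expected)


def t_score(observed, expected):
    """T-score: difference scaled by the square root of the observed count."""
    return (observed - expected) / math.sqrt(observed)


def z_score(observed, expected):
    """Z-score (simplified form): difference scaled by sqrt of the expected count."""
    return (observed - expected) / math.sqrt(expected)


# Hypothetical counts in a 1,000,000-token corpus (invented for illustration).
CORPUS_SIZE = 1_000_000
rare_expected = expected_cooccurrence(10, 10, CORPUS_SIZE)             # rare pair
frequent_expected = expected_cooccurrence(10_000, 20_000, CORPUS_SIZE)  # frequent pair
rare_observed, frequent_observed = 5, 1_000

for name, score in (("MI", mi_score), ("Z", z_score), ("T", t_score)):
    print(name,
          round(score(rare_observed, rare_expected), 2),
          round(score(frequent_observed, frequent_expected), 2))
```

With these invented counts, the MI-score and Z-score rank the rare pair above the frequent one, while the T-score ranks the frequent pair higher. This matches the bias the abstract describes and shows why the measures are often reported together.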

Keywords: concordance; frequency normalization; chi-square test; Z-score; T-score; MI-score

Classification code: H313 [Language and Linguistics: English]

 
