Authors: 霍紫莹 (HUO Ziying); 张敏强 (ZHANG Minqiang); 薛琦 (XUE Qi) (South China Normal University, Guangzhou 510631, China)
Source: 《中国考试》 (Journal of China Examinations), 2021, No. 11, pp. 49-59 (11 pages)
Abstract: The national college entrance examination (NCEE) is a large-scale, high-stakes test in China, in which Chinese writing carries the largest share of the score among all item formats. Because the measurement of writing ability depends on both the scoring scale and the raters, a practical, reliable, and valid scale matters to students and to the education examination authorities alike. Following a data-based paradigm, this empirical study compares the traditional scoring scale with the graded-analytic scoring scale (2nd ed.) for the NCEE Chinese writing test. Twenty experienced raters applied both scales to essays from an NCEE Chinese practice test administered to 1,772 students from Guangdong and Hainan provinces. The mean of the 20 raters' scores was taken as an approximate true score ("quasi-true score") for evaluating the scoring error of the two scales, and rater effects and the scoring scales were examined with the many-facet Rasch model. The results are as follows: 1) With the graded-analytic scoring scale (2nd ed.), the cognitive and psychological burden on raters is reduced, the score distribution is more reasonable, examinees of different writing ability levels are better distinguished, and rater reliability improves considerably; the scoring error, however, is slightly larger than under the traditional scale, probably because of the wider score range. 2) No obvious rater effects are observed under either scale. 3) Compared with the traditional scale, the graded-analytic scoring scale (2nd ed.) has more reasonable dimensions and indicators, its step parameters are well ordered with no reversals when estimating examinee ability, and its ability to separate levels is clearly higher; with effectively trained raters, it is better suited to norm-referenced tests.
Keywords: NCEE Chinese; essay scoring; scoring scale; rater effects; scoring error
Classification No.: G405 [Cultural Science - Principles of Education]