Authors: Mai Yuhua (麦裕华); Li Guangming (黎光明); Qian Yangyi (钱扬义)[2]
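Affiliations: [1] School of Psychology, South China Normal University, Guangzhou 510631; [2] School of Chemistry, South China Normal University, Guangzhou 510631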
Source: Educational Measurement and Evaluation (《教育测量与评价》), 2020, No. 11, pp. 56-64 (9 pages)
Abstract: The experimental operation test in secondary school science courses is a typical performance assessment; it mainly evaluates students' basic ability to carry out common science experiments. To improve scoring quality and optimize the organization and management of the test, this study took a common grade-nine chemistry experimental operation task as an example and applied the many-facet Rasch model (MFRM) to examine rater effects and rater reliability. The findings were as follows: (1) at the group level, raters showed no leniency/severity effect, central tendency effect, halo effect, or differential leniency/severity effect, but they did show some randomness effect, and a weak differential leniency/severity effect appeared when multiple facets were considered simultaneously; (2) inter-rater reliability was acceptable and intra-rater reliability was good; (3) compared with raters who observed four or six examinees, raters who observed two examinees had the largest proportion of low inter-rater reliability. Two suggestions follow: (1) organize systematic pre-examination scoring practice training so that raters reach a consistent understanding of the scoring content and process, especially of the different types of rater effects, and improve their individual ability to score accurately; (2) use the MFRM as an analysis method for scoring quality control, applied as a post hoc check of scoring results.