Authors: Zhang Quanhui (张泉慧)[1]; He Ju (何惧)[2]; Ren Jie (任杰)[3]; Zhang Ying (张颖)[4]; Lu Yan (卢燕)[5]
Affiliations: [1] Department of Information and Assessment, National Medicine Examination Center, Beijing 100097, China; [2] National Medicine Examination Center, Beijing 100097, China; [3] Institute of Language Testing and Talent Evaluation, Beijing Language and Culture University, Beijing 100083, China; [4] Department of Examination Management, National Medicine Examination Center, Beijing 100097, China; [5] Department of Development Research, National Medicine Examination Center, Beijing 100097, China
Source: Chinese Journal of Medical Education (《中华医学教育杂志》), 2022, Issue 7, pp. 577-580 (4 pages)
Abstract: Objective: To analyze examinee responses from two annual administrations of the Standardized Competence Test for undergraduates of clinical medicine (hereafter, the competence test) using equating methods based on classical test theory (CTT) and item response theory (IRT), and to identify the equating method best suited to this test. Methods: Four CTT-based methods were applied: Tucker observed-score linear equating, Levine observed-score linear equating, equipercentile equating, and smoothed equipercentile equating. Under the IRT one-parameter and two-parameter models, three calibration methods were applied to each model (separate calibration, concurrent calibration, and fixed common item parameter calibration), giving six IRT-based methods. The stability of the 10 equating results was assessed by the standard error of equating. Results: The standard error of equating ranged from 0.7 to 1.6 for the CTT methods and from 0.2 to 0.6 for the IRT methods, so the IRT methods had smaller errors. Among the CTT methods, Tucker observed-score linear equating had the smallest error (0.7) and smoothed equipercentile equating had the largest (1.6). Among the IRT methods, the one-parameter model gave better equating results than the two-parameter model, and within the one-parameter model, fixed common item parameter calibration had the smallest error (0.2). Conclusions: Fixed common item parameter calibration under the IRT one-parameter model can be selected as the equating method for the competence test. After equating, the year 2 scores were adjusted upward while the passing standard remained unchanged, which made the scores comparable across years and supported the fairness of the test.