A comparative study of equating methods applied in the standardized competence test for clinical medicine undergraduates

Cited by: 4

Authors: Zhang Quanhui[1], He Ju[2], Ren Jie[3], Zhang Ying[4], Lu Yan[5]

Affiliations: [1] Department of Information and Assessment, National Medicine Examination Center, Beijing 100097, China; [2] National Medicine Examination Center, Beijing 100097, China; [3] Institute of Language Testing and Talent Evaluation, Beijing Language and Culture University, Beijing 100083, China; [4] Department of Examination Management, National Medicine Examination Center, Beijing 100097, China; [5] Department of Development Research, National Medicine Examination Center, Beijing 100097, China

Source: Chinese Journal of Medical Education (《中华医学教育杂志》), 2022, Issue 7, pp. 577-580 (4 pages)

Abstract: Objective To analyze examinee responses from two annual administrations of the Standardized Competence Test for clinical medicine undergraduates (hereafter, the competence test) using equating methods based on classical test theory (CTT) and item response theory (IRT), and to identify the equating method best suited to the test. Methods Four CTT-based methods were applied: Tucker observed-score linear equating, Levine observed-score linear equating, unsmoothed equipercentile equating, and smoothed equipercentile equating. Under IRT, the one-parameter and two-parameter models were each combined with three calibration methods (separate calibration, concurrent calibration, and fixed common-item parameter calibration), giving six IRT-based methods. The stability of the 10 equating results was assessed with the standard error of equating. Results The standard error of equating was 0.7 to 1.6 for the CTT methods and 0.2 to 0.6 for the IRT methods, so the IRT methods had the smaller errors. Among the CTT methods, Tucker observed-score linear equating had the smallest error (0.7) and smoothed equipercentile equating the largest (1.6). Among the IRT methods, the one-parameter model outperformed the two-parameter model, and within the one-parameter model fixed common-item parameter calibration had the smallest error (0.2). Conclusions Fixed common-item parameter calibration under the one-parameter IRT model can be selected for equating the competence test. After equating, Year 2 scores were adjusted upward while the passing standard remained unchanged, which made scores comparable across years and supported the fairness of the examination.
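To make the Tucker method named in the abstract concrete, the following is a minimal Python sketch of Tucker observed-score linear equating under a common-item (anchor test) design, using the standard textbook formulas. It is not the authors' implementation: the function name, the argument names, and the equal synthetic-population weights (`w1=0.5`) are assumptions chosen for illustration, and any scores passed to it here are hypothetical.

```python
from statistics import fmean


def _var(a):
    """Population variance (divides by n)."""
    m = fmean(a)
    return fmean([(v - m) ** 2 for v in a])


def _cov(a, b):
    """Population covariance (divides by n)."""
    ma, mb = fmean(a), fmean(b)
    return fmean([(x - ma) * (y - mb) for x, y in zip(a, b)])


def tucker_linear_equating(x1, v1, y2, v2, w1=0.5):
    """Return a function mapping a Form X score onto the Form Y scale.

    x1, v1: total and common-item scores of the group that took Form X.
    y2, v2: total and common-item scores of the group that took Form Y.
    w1: weight of group 1 in the synthetic population (assumed 0.5 here).
    """
    w2 = 1.0 - w1
    g1 = _cov(x1, v1) / _var(v1)  # slope of X on V in group 1
    g2 = _cov(y2, v2) / _var(v2)  # slope of Y on V in group 2
    dmu = fmean(v1) - fmean(v2)   # anchor mean difference between groups
    dvar = _var(v1) - _var(v2)    # anchor variance difference between groups

    # Synthetic-population means and variances of X and Y.
    mu_x = fmean(x1) - w2 * g1 * dmu
    mu_y = fmean(y2) + w1 * g2 * dmu
    var_x = _var(x1) - w2 * g1**2 * dvar + w1 * w2 * g1**2 * dmu**2
    var_y = _var(y2) + w1 * g2**2 * dvar + w1 * w2 * g2**2 * dmu**2

    slope = (var_y / var_x) ** 0.5
    return lambda x: slope * (x - mu_x) + mu_y
```

A typical call would be `equate = tucker_linear_equating(x1, v1, y2, v2)` followed by `equate(score)` for each Form X raw score. The standard error of equating reported in the abstract is a separate quantity; one common way to approximate it, not shown here, is to bootstrap examinees and take the standard deviation of the equated scores.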

Keywords: clinical medicine; standardized competence test; classical test theory; item response theory; equating

Classification: R-05 [Medicine and Health]
