Validity of the IRT_Δb Method and the Adapted LR Method for DIF Detection in Matrix-Sampling Tests  Cited by: 2

Applying IRT_ΔB Procedure and Adapted LR Procedure to Detect DIF in Tests with Matrix Sampling


Authors: Zhang Xun[1], Li Lingyan[1], Liu Hongyun[2], Sun Yan[1]

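Affiliations: [1] State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing 100875; [2] School of Psychology, Beijing Normal University, Beijing 100875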

Source: Acta Psychologica Sinica (《心理学报》), 2013, No. 8, pp. 921-934 (14 pages)

Abstract: A matrix-sampling test comprises multiple booklets, so the total score on a single booklet cannot be used directly as the matching variable for DIF detection. Using simulated data, this study first applied both the IRT_Δb method and an LR method adapted to use examinee ability estimated from an IRT model as the matching variable to detect DIF in a matrix-sampling test, and analyzed the effectiveness of the two methods and the factors that influence it. Classification criteria for the IRT_Δb method were then derived from the existing LR criteria for judging DIF, and finally checked against empirical data. The results show that, in matrix-sampling tests, both the IRT_Δb method and the adapted LR method distinguish well between items with different amounts of DIF; sample size, the proportion of DIF items in a booklet, and the difference in true ability between examinee groups all have a substantial effect on the power, Type I error rate, and classification results of both methods.
Matrix sampling is a technique widely used in large-scale educational assessments. In an assessment with a matrix sampling design, each examinee takes one of multiple booklets, each containing only part of the items. A critical problem in detecting differential item functioning (DIF) in this setting has gained considerable attention in recent years: the observed total score obtained from an individual booklet is not an appropriate matching variable. Therefore, traditional detection methods such as Mantel-Haenszel (MH), SIBTEST, and Logistic Regression (LR) are not suitable. IRT_Δb might be an alternative because it can provide a valid matching variable. However, DIF classification criteria for IRT_Δb have not yet been well established. Thus, the purposes of this study were: 1) to investigate the efficiency and robustness of using ability parameters obtained from an Item Response Theory (IRT) model as the matching variable, compared with using traditional observed raw total scores; 2) to identify which factors influence the ability of the two methods to detect DIF; 3) to propose DIF classification criteria for IRT_Δb. Both simulated and empirical data were used to explore the robustness and efficiency of the two DIF detection methods: the IRT_Δb method and the adapted LR method, which takes the IRT-based estimate of group-level ability as the matching variable.
In the Monte Carlo study, a matrix sampling test was generated, and the experimental conditions varied four factors: 1) the proportion of DIF items; 2) the actual examinee ability distributions; 3) the sample size; 4) the size of DIF. The two DIF detection methods were then applied and their results compared. In addition, power functions were established in order to derive a DIF classification rule for IRT_Δb based on current rules for LR. In the empirical study, through cond

Keywords: matrix-sampling test; differential item functioning; Rasch model; logistic regression

Classification code: B841 [Philosophy and Religion: Basic Psychology]
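The adapted LR idea in the abstract, matching examinees on an IRT-based ability estimate instead of a booklet total score, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a uniform-DIF-only model, logit P(correct) = β0 + β1·θ + β2·G, and tests β2 with a likelihood-ratio statistic. The function names (`solve`, `fit_logistic`, `lr_dif_statistic`, `simulate_item`), the simulated sample size, and the use of the simulated true θ in place of an estimated ability are all assumptions made for illustration.

```python
import math
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_logistic(X, y, iterations=25):
    """Fit a logistic regression by Newton-Raphson; return (coefficients, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, target in zip(X, y):
            eta = sum(bj * v for bj, v in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (target - p) * row[i]
                for j in range(k):
                    hess[i][j] += w * row[i] * row[j]
        beta = [bj + s for bj, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for row, target in zip(X, y):
        eta = sum(bj * v for bj, v in zip(beta, row))
        loglik += target * eta - math.log1p(math.exp(eta))
    return beta, loglik


def lr_dif_statistic(theta, group, responses):
    """Likelihood-ratio chi-square (1 df) for uniform DIF, matching on theta.

    theta: ability values used as the matching variable (assumed to come from an IRT model).
    group: 0 for the reference group, 1 for the focal group.
    """
    base = [[1.0, t] for t in theta]
    full = [[1.0, t, float(g)] for t, g in zip(theta, group)]
    _, ll_base = fit_logistic(base, responses)
    beta_full, ll_full = fit_logistic(full, responses)
    return 2.0 * (ll_full - ll_base), beta_full[2]


def simulate_item(n, difficulty, dif_shift, rng):
    """Simulate Rasch responses to one item; dif_shift makes the item harder for the focal group."""
    theta, group, responses = [], [], []
    for i in range(n):
        g = i % 2
        t = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(t - (difficulty + dif_shift * g))))
        theta.append(t)
        group.append(g)
        responses.append(1 if rng.random() < p else 0)
    return theta, group, responses


rng = random.Random(0)
g2_dif, coef_dif = lr_dif_statistic(*simulate_item(2000, 0.0, 1.0, rng))
g2_clean, coef_clean = lr_dif_statistic(*simulate_item(2000, 0.0, 0.0, rng))
print("DIF item:   G2 = %.2f, group coefficient = %.2f" % (g2_dif, coef_dif))
print("Clean item: G2 = %.2f, group coefficient = %.2f" % (g2_clean, coef_clean))
```

In a real matrix-sampling application, θ would be estimated on a common scale across booklets before this step, which is what lets examinees who took different booklets be matched. The statistic is compared with a chi-square distribution with one degree of freedom (critical value 3.84 at α = 0.05). The abstract does not state the effect-size thresholds for classifying DIF, so none are assumed here.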

 
