A Study of Person Fit Based on a Residual Statistic in Polytomously Scored Tests  Cited by: 2

Detection of aberrant response patterns using a residual-based statistic in testing with polytomous items


Authors: TONG Hao; YU Xiaofeng[1]; QIN Chunying; PENG Yafeng[1]; ZHONG Xiaoyuan (School of Psychology, Jiangxi Normal University, Nanchang 330022, China; School of Mathematics and Information Science, Nanchang Normal University, Nanchang 330032, China)

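Affiliations: [1] School of Psychology, Jiangxi Normal University, Nanchang 330022; [2] School of Mathematics and Information Science, Nanchang Normal University, Nanchang 330032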

Source: Acta Psychologica Sinica, 2022, Issue 9, pp. 1126-1140 (15 pages)

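Funding: Supported by the National Education Sciences Planning Project (BGA210060); the Jiangxi Provincial Social Science Foundation Project (21JY06); the Jiangxi Provincial Universities Humanities and Social Sciences Project (XL20202); the Nanchang Key Laboratory of Educational Big Data Intelligent Technology (2020−NCZDSY−012); and the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ191691, GJJ191128).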

Abstract: This paper proposes a person-fit statistic, R, for polytomously scored items, examines its performance in detecting six common aberrant response patterns (cheating, guessing, random responding, carelessness, creative responding, and mixed aberrance), and compares it with the standardized log-likelihood statistic lzp. The results show that: (1) when the coverage of aberrant responses is low and the aberrance type is cheating or guessing, the detection rate of R is significantly higher than that of lzp; (2) the detection rates of both statistics rise as test length and the degree of aberrance increase; (3) under some conditions, R and lzp perform similarly. An empirical data analysis further illustrates how the R statistic is used, and the results suggest that it has good application prospects.

Tests are widely used in educational measurement and psychometrics, and examinees' aberrant responses affect the estimation of their abilities. Examinees with aberrant responses should not be treated with conventional methods; the important thing is to screen them out of the normal group accurately. A common way to do this is to construct person-fit statistics that detect whether a response pattern fits the examinee's estimated ability. In this study, a residual-based person-fit statistic, R, is proposed, which can be applied to both dichotomous and polytomous IRT models. R is constructed from a weighted residual between the observed response and the expected response. By accumulating the weighted residuals, a goodness-of-fit value is calculated and compared with a critical value to determine whether an examinee is aberrant. Because polytomous items can provide more information, they are increasingly popular in educational measurement and psychometrics. This article mainly considers the ability of the R statistic to detect aberrant response patterns under the graded response model. An existing polytomous person-fit statistic, lzp, is also introduced for comparison because of its standardized form and superior power. In the first study, a simulation was conducted to generate the empirical distributions of R and lzp. R is an accumulation of weighted residuals and shows a positively skewed distribution; lzp shows a negatively skewed distribution when the test has fewer than 80 items. Both differ from the standard normal distribution, so a critical value must be set according to the Type I error rate and used to decide whether each respondent's response pattern fits. In the second study, examinees with different aberrant behaviors (e.g., cheaters, lucky guessers, random respondents, careless respondents, creative respondents, and mixed) were simulated under different test-length conditions, and the detection rate as well as the area under the curve (

Keywords: polytomous items; item response theory; person-fit statistic; aberrant behavior detection; graded response model

Classification number: B841 [Philosophy and Religion: Basic Psychology]
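The abstract describes a person-fit check as three steps: compute expected responses under the graded response model, accumulate weighted residuals between observed and expected responses, and compare the total with a critical value taken from a simulated null distribution. The Python sketch below illustrates that workflow only. The abstract does not give the weighting used in R, so `residual_fit` uses variance-weighted squared residuals as a stand-in assumption; it is not the authors' statistic. The function names, the item parameters, and the use of a known ability value (rather than an estimated one) are also hypothetical choices made for illustration.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Category probabilities under the graded response model.

    theta: scalar ability; a: (n_items,) discriminations;
    b: (n_items, n_cats - 1) ordered thresholds.
    Returns an (n_items, n_cats) array whose rows sum to 1.
    """
    # Cumulative probabilities P(X >= k) for k = 1 .. n_cats - 1
    cum = 1.0 / (1.0 + np.exp(-a[:, None] * (theta - b)))
    ones = np.ones((len(a), 1))
    zeros = np.zeros((len(a), 1))
    # P(X = k) = P(X >= k) - P(X >= k + 1)
    return np.hstack([ones, cum]) - np.hstack([cum, zeros])

def residual_fit(x, probs):
    """Illustrative stand-in (an assumption, not the paper's R):
    sum over items of squared residuals divided by the model variance."""
    cats = np.arange(probs.shape[1])
    expected = probs @ cats
    variance = probs @ cats**2 - expected**2
    return float(np.sum((x - expected) ** 2 / variance))

def simulate_responses(rng, probs):
    """Draw one model-consistent response per item by inverse-CDF sampling."""
    u = rng.random(probs.shape[0])
    cdf = np.cumsum(probs, axis=1)[:, :-1]
    return (u[:, None] > cdf).sum(axis=1)

def empirical_critical_value(rng, theta, a, b, alpha=0.05, n_rep=2000):
    """Upper-tail cutoff from the simulated null distribution of the statistic."""
    probs = grm_category_probs(theta, a, b)
    stats = [residual_fit(simulate_responses(rng, probs), probs)
             for _ in range(n_rep)]
    return float(np.quantile(stats, 1.0 - alpha))

# Hypothetical 20-item test with four score categories (0..3) per item.
rng = np.random.default_rng(0)
n_items = 20
a = np.linspace(1.0, 2.0, n_items)
b = np.tile(np.array([-1.0, 0.0, 1.0]), (n_items, 1))
theta = -1.0  # assumed known here; in practice it would be estimated

cut = empirical_critical_value(rng, theta, a, b)
# A low-ability examinee with the top score on every item (cheating-like).
aberrant = np.full(n_items, 3)
r_value = residual_fit(aberrant, grm_category_probs(theta, a, b))
flagged = r_value > cut
print(f"statistic={r_value:.1f}, critical value={cut:.1f}, flagged={flagged}")
```

An upper-tail cutoff is used because the abstract reports that the accumulated residuals are positively skewed, so large values indicate misfit. A statistic with a negatively skewed null distribution, such as lzp, would instead be flagged on its lower tail.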

 
