Exploration of automated L2 writing evaluation based on GPT-4: Taking IELTS Writing Task 2 as an example

Authors: DONG Yanyun[1]; QI Xinyang; MA Xiaomei[1]

Source: Language Testing and Assessment, 2024, No. 2, pp. 13-30 (18 pages)

Abstract: This study explores the capability of GPT-4 to assess small-sample L2 writing. Taking IELTS Writing Task 2 as an example, it adopts a prompt engineering strategy built on six distinct types of prompt. By examining data distribution, inter-rater correlation, and inter-rater agreement, the study analyzes step by step how GPT-4's scoring performs on the experimental set under different prompt windows. Two findings emerge. First, the "minimal + criteria + examples" prompt yields the best results, and this is confirmed again on the validation set. Under this optimal prompt, GPT-4's scores show strong agreement and a strong correlation with the examiner's scores. Second, there is an information discrepancy between the examiner's comments on the one hand and the scoring criteria and calibration examples on the other. The comments may interfere with GPT-4's assessment, so including them in the prompts is not recommended. The study aims to provide empirical support for applying GPT-4 to writing evaluation in educational settings and a foundation for further exploration of its implementation in classroom contexts.
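The abstract does not say which statistics were used for the correlation analysis and the agreement check. As a minimal sketch, assuming Pearson's r for correlation and quadratically weighted Cohen's kappa for agreement (both are assumptions, as are the function names and the band scores below, which are invented for illustration and are not data from the study), the comparison between GPT-4 and examiner scores might look like this:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def quadratic_weighted_kappa(x, y, min_band=0.0, max_band=9.0, step=0.5):
    """Quadratically weighted Cohen's kappa for half-band IELTS scores.

    Bands are mapped to integer categories (0.0 -> 0, 0.5 -> 1, ...),
    so a disagreement of two bands is penalised more than one of half a band.
    """
    k = int(round((max_band - min_band) / step)) + 1  # number of categories
    n = len(x)

    def to_index(score):
        return int(round((score - min_band) / step))

    # Observed confusion matrix between the two raters.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(x, y):
        observed[to_index(a)][to_index(b)] += 1

    hist_x = [sum(row) for row in observed]
    hist_y = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters scored independently.
            weighted_expected += weight * hist_x[i] * hist_y[j] / n

    if weighted_expected == 0:
        # Both raters gave one identical score to every essay.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical band scores, for illustration only.
    examiner = [6.0, 6.5, 7.0, 5.5, 8.0, 6.0]
    gpt4 = [6.0, 7.0, 7.0, 5.0, 7.5, 6.5]
    print("Pearson r:", round(pearson_r(examiner, gpt4), 3))
    print("Weighted kappa:", round(quadratic_weighted_kappa(examiner, gpt4), 3))
```

Reporting both figures is useful because correlation alone can be high even when one rater is systematically stricter, whereas a weighted kappa also penalises such offsets.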

Keywords: GPT-4; IELTS Writing Task 2; automated essay scoring; inter-rater agreement

Classification code: G63 [Culture and Science: Education]
