Psychological Factors and Academic Performance: A Machine Learning Classification Prediction Model

Using Demographic Information, Psychological Assessment Data and Machine Learning to Predict Students' Academic Performance


Authors: Ding Xinfang[1], Nie Jing[2], Zhang Bin[3] (School of Medical Humanities, Capital Medical University, Beijing, 100069; Student Counseling and Mental Health Center, Peking University, Beijing, 100871; Student Affairs Department, Capital Medical University, Beijing, 100069)

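Affiliations: [1] School of Medical Humanities, Capital Medical University, Beijing 100069 [2] Student Counseling and Mental Health Center, Peking University, Beijing 100871 [3] Student Affairs Department, Capital Medical University, Beijing 100069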

Source: Journal of Psychological Science (《心理科学》), 2021, No. 2, pp. 330-339 (10 pages)

Funding: Supported by the Social Science Research Common Program of Beijing Municipal Commission of Education (SM201810025001).

Abstract: With the expansion of higher education, poor academic performance has become a phenomenon that cannot be ignored. Predicting which students will perform poorly and intervening early can lower dropout rates and reduce the loss of educational resources. Because the factors behind poor academic performance are numerous and interrelated in complex ways, traditional methods based on correlation analysis struggle to produce early prediction models that can be applied in practice. This study aimed to mine the data with machine learning algorithms and build a model for predicting academic performance. Data on mental health, coping style, personality, locus of control and related demographic information were collected from 653 first-year students, and their academic results were collected one year later. Classification models were built with Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT) and Naïve Bayes (NB) algorithms. Random Forest performed best at identifying students with poor academic performance, with a precision of 95.86%, a recall of 91.83% and an F1 score of 93.80%. Feature importance analysis showed that the 10 features contributing most to the model were: age, gender, only-child status, locus of control, neuroticism, positive coping, agreeableness, the general symptom index, openness and anxiety level. To guard against overfitting, the model was validated on a sample of 166 first-year students collected one year later; it generalized well to the new data set, with an F1 score of 90.90%, a precision of 92.60% and a recall of 89.26%. The findings suggest that, based on demographic and psychological assessment information, machine learning algorithms can help identify students with poor academic performance early and can inform early intervention.

Tracking college students' academic performance and predicting which students are likely to fail courses are important for providing early intervention and increasing retention rates. Previous studies have found that many psychological factors are correlated with academic marks, including personality, coping styles, mental health, and academic and social motivational constructs. However, the traditional approach of studying correlated factors often fails to yield an early prediction model, since the mechanism underlying poor academic performance is generally complicated and the patterns are sometimes implicit. Machine learning is an approach that detects implicit patterns in large data sets via algorithms and statistical models; it can optimize exploratory analysis by providing internal cross-validation and is more robust to outliers. The present study used a machine learning approach, with demographic information and the results of psychological assessments as input, to distinguish students who failed courses in their first year at college from those who did not. Six hundred and fifty-three participants from five universities in northern China were recruited. They were required to complete a demographic information survey, the Symptom Checklist 90, the Rotter Internal-External Locus of Control Scale, the Trait Coping Style Questionnaire and the Big-Five Personality Inventory-10. These questionnaires measured mental health, generalized control expectancies (internal-external locus of control), coping styles and personality, respectively. Academic performance information was collected one year later. Low-performing students were defined as those who failed at least one course in their first year at college. Five machine learning algorithms, Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Naïve Bayes (NB) and Decision Tree (DT), were trained to build a dichotomous classification model to detect low-performing students. The results showed that the highest classification F1 score was obtained b
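The F1 scores quoted in the abstract can be checked against the other two figures. The sketch below assumes that the first figure in each triple is precision (the Chinese term 准确率 is sometimes also used for accuracy) and that F1 is the usual harmonic mean of precision and recall. The function name and the sample labels are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, read here as (precision, recall).
# Reading the first value as precision is an assumption, not stated in the source.
reported = {
    "model-building sample (n=653)": (0.9586, 0.9183),
    "validation sample (n=166)": (0.9260, 0.8926),
}

for sample, (p, r) in reported.items():
    print(f"{sample}: F1 = {f1_score(p, r):.2%}")
```

Under that reading, the computed values round to 93.80% and 90.90%, which match the F1 scores given in the abstract for the two samples.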

Keywords: academic performance; machine learning; prediction; psychological factors; classification prediction model

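Classification codes: TP181 [Automation and Computer Technology: Control Theory and Control Engineering] G642 [Automation and Computer Technology: Control Science and Engineering]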

 
