Research on unbalanced traffic accident risk based on ensemble learning  (Cited by: 2)


Authors: FANG Fang; WANG Xin [1] (School of Applied Science, Beijing Information Science & Technology University, Beijing 100192, China)

Affiliation: [1] School of Applied Science, Beijing Information Science & Technology University, Beijing 100192, China

Source: Journal of Beijing Information Science and Technology University (Natural Science Edition), 2021, No. 6, pp. 19-24 (6 pages)

Funding: National Natural Science Foundation of China (71501016).

Abstract: To address the class imbalance among accident categories in traffic accident data, random undersampling (RUS) is combined with extreme gradient boosting (XGBoost) to build RUS-XGBoost, an accident risk prediction model for class-imbalanced data. Sample perturbation, feature perturbation, and parameter perturbation are used to construct diverse sub-models for prediction. Predictive performance is evaluated with AUC and the cost-sensitive error rate, and the model's superiority is verified by comparison with other models. The gain values computed by the model are used to identify the main factors affecting accident risk. Experiments on the traffic accident dataset published by the UK government show that the model outperforms single logistic regression, random forest, and XGBoost models, as well as logistic regression ensemble and random forest ensemble models.
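The abstract names three sources of sub-model diversity and two evaluation metrics without giving details. The following is a minimal sketch of how these pieces might look in plain Python, not the authors' implementation. The function names, the 80% feature fraction, the hyperparameter candidates (`max_depth`, `learning_rate`), and the misclassification costs are all assumptions chosen for illustration. The cost-sensitive error rate uses the common textbook form (weighted false negatives plus weighted false positives, divided by the sample count), which may differ from the paper's exact definition. Training the XGBoost sub-models is omitted.

```python
import random


def undersample_indices(y, rng):
    """Random undersampling: keep every minority row and an equal-sized
    random subset of majority rows. Returns row indices."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return kept


def build_submodel_plans(y, n_features, n_models, seed=0):
    """Draw one training plan per sub-model. All ranges are illustrative."""
    rng = random.Random(seed)
    n_cols = max(1, int(0.8 * n_features))  # assumed feature fraction
    plans = []
    for _ in range(n_models):
        plans.append({
            # sample perturbation: a fresh balanced subset per sub-model
            "rows": undersample_indices(y, rng),
            # feature perturbation: a random subset of feature columns
            "cols": sorted(rng.sample(range(n_features), n_cols)),
            # parameter perturbation: randomly chosen hyperparameters
            "params": {
                "max_depth": rng.choice([3, 4, 5, 6]),
                "learning_rate": rng.choice([0.05, 0.1, 0.2]),
            },
        })
    return plans


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive is scored higher; ties count as half."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cost_sensitive_error(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """(FN * cost_fn + FP * cost_fp) / n, with assumed costs."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (fn * cost_fn + fp * cost_fp) / len(y_true)


if __name__ == "__main__":
    labels = [1] * 3 + [0] * 17  # toy data: 3 positive rows, 17 negative
    for plan in build_submodel_plans(labels, n_features=10, n_models=3):
        print(len(plan["rows"]), plan["cols"], plan["params"])
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
    print(cost_sensitive_error([1, 1, 0, 0], [1, 0, 1, 0]))
```

Each plan would drive the training of one sub-model on the selected rows and columns with the drawn hyperparameters. How the sub-model predictions are combined (for example by averaging predicted probabilities) is not stated in the abstract, so it is left open here.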

Keywords: intelligent transportation; traffic accident risk prediction; ensemble learning; cost-sensitive

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
