Authors: FANG Fang; WANG Xin[1] (School of Applied Science, Beijing Information Science & Technology University, Beijing 100192, China)
Source: Journal of Beijing Information Science and Technology University (Natural Science Edition), 2021, No. 6, pp. 19-24 (6 pages)
Funding: National Natural Science Foundation of China (71501016).
Abstract: To address the class imbalance among accident categories in traffic accident data, random undersampling (RUS) is combined with extreme gradient boosting (XGBoost) to build RUS-XGBoost, an accident risk prediction model for class-imbalanced data. Sample perturbation, feature perturbation, and parameter perturbation are used to construct diverse sub-models for prediction. The model's predictive performance is evaluated with AUC and the cost-sensitive error rate, and its advantage is verified by comparison with other models. The gain values computed by the model are used to identify the main factors affecting accident risk. Experiments on the traffic accident dataset published by the UK government show that the model outperforms single logistic regression, random forest, and XGBoost models, as well as logistic regression ensemble and random forest ensemble models.
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
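The abstract names three building blocks that are easy to misread without a concrete form: random undersampling, AUC, and the cost-sensitive error rate. The following is a minimal standard-library sketch of those three pieces only. It does not include XGBoost or the perturbation-based sub-models, and it is not the authors' implementation. The function names, the 1 = minority (accident class of interest) label convention, and the cost values `cost_fn=5.0` and `cost_fp=1.0` are illustrative assumptions; the cost-sensitive error rate follows a common textbook definition, which may differ from the one used in the paper.

```python
import random


def random_undersample(X, y, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumes binary labels 0/1; balancing to a 1:1 ratio is an assumption.
    """
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [X[i] for i in kept], [y[i] for i in kept]


def cost_sensitive_error(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """Average misclassification cost per sample.

    cost_fn: cost of predicting 0 for a true 1 (hypothetical value).
    cost_fp: cost of predicting 1 for a true 0 (hypothetical value).
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            total += cost_fn
        elif t == 0 and p == 1:
            total += cost_fp
    return total / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Quadratic in the sample count, so only
    suitable for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a setup like the one the abstract describes, each sub-model would presumably be trained on its own undersampled draw (a different `seed` here), which is one way sample perturbation could be realized; that mapping is an assumption, not something the record states.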