A Class-Imbalanced Credit Scoring Model Based on SMOTE-AdaBoost-DT  Cited by: 2

A SMOTE-AdaBoost-DT model for credit scoring


Authors: ZHAO Jiali (赵佳丽); XU Mingjiang (徐明江); WU Zengyuan (吴增源); ZHENG Suli (郑素丽) (College of Economics and Management, China Jiliang University, Hangzhou 310018, China; Hangzhou Qiandao Lake Development Group Co., Ltd., Hangzhou 311799, China)

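Affiliations: [1] College of Economics and Management, China Jiliang University, Hangzhou, Zhejiang 310018 [2] Hangzhou Qiandao Lake Development Group Co., Ltd., Hangzhou, Zhejiang 311701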

Source: Journal of China University of Metrology (《中国计量大学学报》), 2021, No. 4, pp. 549-554 (6 pages)

Funding: National Natural Science Foundation of China (No. 71572187); Zhejiang Provincial Natural Science Foundation (No. LY20G010008).

Abstract: Aims: To address class imbalance in credit scoring samples, this paper proposes a new classification method, the synthetic minority oversampling technique-adaptive boosting-decision tree (SMOTE-AdaBoost-DT) model. Methods: First, SMOTE generates synthetic minority-class samples to reduce the imbalance of the data. Second, the AdaBoost algorithm, with decision trees (DT) as base classifiers, performs the classification. Finally, a credit dataset from the Kaggle platform is used for empirical validation. Results: With the area under the curve (AUC) and G-mean as evaluation metrics, the SMOTE-AdaBoost-DT model achieved a mean AUC of 89.19% and a mean G-mean of 89.09%, outperforming decision tree, random forest, AdaBoost, and backpropagation neural network models, and its metrics had the smallest standard deviation. Conclusions: The proposed model improves both the accuracy of customer credit scoring and the stability of the model.
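The three steps named in the abstract (oversample with SMOTE, classify with AdaBoost over decision trees, score with AUC and G-mean) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are synthetic stand-ins from scikit-learn rather than the Kaggle credit dataset. The `smote` helper is a simplified, hand-written version of the technique. The function names, tree depth, number of estimators, and neighbour count are all assumptions chosen for illustration. G-mean is taken here as the square root of sensitivity times specificity, its usual definition.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def smote(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE (illustrative): interpolate between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)  # column 0 is normally the point itself
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))  # random position along the segment
    return X_min[base] + gap * (X_min[neigh] - X_min[base])


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = recall_score(y_true, y_pred, pos_label=1)
    specificity = recall_score(y_true, y_pred, pos_label=0)
    return float(np.sqrt(sensitivity * specificity))


def run_sketch(seed=0):
    # Synthetic imbalanced data (about 10% minority); an assumption,
    # standing in for a real credit dataset.
    X, y = make_classification(n_samples=2000, weights=[0.9, 0.1],
                               random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    # Step 1: oversample the minority class in the training split only.
    n_new = int((y_tr == 0).sum() - (y_tr == 1).sum())
    X_syn = smote(X_tr[y_tr == 1], n_new, seed=seed)
    X_bal = np.vstack([X_tr, X_syn])
    y_bal = np.concatenate([y_tr, np.ones(n_new, dtype=y_tr.dtype)])

    # Step 2: AdaBoost with shallow decision trees as base classifiers.
    model = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                               n_estimators=100, random_state=seed)
    model.fit(X_bal, y_bal)

    # Step 3: evaluate on the untouched test split.
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    gm = g_mean(y_te, model.predict(X_te))
    return auc, gm, np.bincount(y_bal)


if __name__ == "__main__":
    auc, gm, counts = run_sketch()
    print(f"AUC={auc:.4f}  G-mean={gm:.4f}  balanced counts={counts}")
```

Oversampling only the training split keeps synthetic samples out of the evaluation data. The scores this sketch prints come from synthetic data and are not comparable to the figures reported in the abstract.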

Keywords: credit scoring; SMOTE; ensemble learning; imbalanced classification

Classification code: F832.4 [Economics and Management: Finance]

 
