Self-adaptive integrated credit risk assessment model based on improved SMOTE (基于改进SMOTE自适应集成的信用风险评估模型)

Cited by: 1

Authors: YU Qinli (于勤丽); YU Haizheng (于海征)[1]

Affiliation: [1] College of Mathematics and System Science, Xinjiang University, Urumqi 830000, China

Source: Journal of Chongqing University of Technology: Natural Science (《重庆理工大学学报(自然科学)》), 2022, No. 7, pp. 293-302 (10 pages)

Funding: National Natural Science Foundation of China (61662079, 11761070, U1703262); Autonomous Region Natural Science Foundation Joint Project (2021D01C078).

Abstract: SMOTE and similar oversampling methods synthesize the same number of new samples for every minority-class sample and can generate noisy samples near the class boundary. To address these shortcomings, this paper proposes an improved SMOTE oversampling method. To raise the identification rate of defaulting users and build an efficient, accurate credit risk assessment model, the improved SMOTE method is used to balance the imbalanced data, and a Stacking ensemble model built on the diversity of its base models is used to identify defaulting users. The Stacking model is a high-performance ensemble model that performs well in credit risk assessment. Because the Stacking model is prone to overfitting, its base models are selected adaptively according to the JC index, so that the selected base models are both accurate and sufficiently different from one another while the accuracy of the overall model is preserved as far as possible. Experimental results on the Lending Club dataset show that the Stacking ensemble model composed of the base classifiers selected by the JC index performs better.
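For orientation, the baseline that the abstract criticizes can be sketched as follows. This is a minimal sketch of classic SMOTE interpolation only, not the improved method proposed in the paper, whose details the abstract does not give. The function name `smote_oversample` and the parameters `n_per_sample`, `k`, and `seed` are illustrative assumptions. Every minority sample produces exactly `n_per_sample` synthetic points, which is the uniform allocation the abstract identifies as a shortcoming.

```python
import numpy as np


def smote_oversample(X_min, n_per_sample=1, k=5, seed=0):
    """Classic SMOTE sketch (hypothetical helper, not the paper's method).

    Each minority sample yields exactly n_per_sample synthetic points,
    each placed on the segment between that sample and one of its
    k nearest minority-class neighbours.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances among minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of each sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Uniform allocation: the same number of new points per sample.
    base = np.repeat(np.arange(n), n_per_sample)
    chosen = neighbours[base, rng.integers(0, k, size=base.size)]

    # Random interpolation factor in [0, 1) for each synthetic point.
    gap = rng.random((base.size, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])
```

Because each synthetic point is a convex combination of two minority samples, a minority sample that sits next to majority-class points can still seed new points in the overlap region, which is one way boundary noise can arise.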

Keywords: oversampling; Stacking model; self-adaptive ensemble; imbalanced dataset

Classification: TP92 [Automation and Computer Technology]; F830 [Economics and Management: Finance]

 
