Risk Control Model and Algorithm Based on AP-Entropy Selection Ensemble  (Cited by: 1)


Authors: 王茂光 (WANG Mao-guang)[1]; 杨行 (YANG Hang) (School of Information, Central University of Finance and Economics, Beijing 100081, China)

Affiliation: [1] School of Information, Central University of Finance and Economics, Beijing 100081

Source: Computer Science (《计算机科学》), 2021, Issue S02, pp. 71-76, 80 (7 pages)

Abstract: In recent years, numerous risk control problems have emerged in Internet finance and online lending. To address them, the authors preprocess risk control data indicators with several feature selection methods, construct a comprehensive risk control indicator system for corporate credit, and use a stacking ensemble strategy to study a credit risk model based on AP-Entropy. The credit risk model has two layers of learners and introduces the idea of selective ensembling to screen base learners by both type and number. First, from classical machine learning algorithms such as logistic regression, back-propagation neural networks, and AdaBoost, the AP clustering algorithm selects heterogeneous learners suited to corporate credit risk as base learners. Second, in each learner iteration, entropy is used to choose among the learners, automatically selecting the base learner with the highest F1 score; an improved entropy-based learner selection algorithm makes this selection process more efficient and lowers the model's computational cost. XGBoost serves as the second-level learner. Experimental results show that, compared with other models, the proposed model learns better and generalizes better.

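Keywords: risk control indicator system; stacking ensemble strategy; AP-Entropy credit risk model; selective ensemble; AP clustering algorithm; entropy-based learner selection algorithm; XGBoost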

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
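
Illustrative sketch: the abstract describes choosing base learners iteratively using entropy and F1, but it does not give the formulas or the selection rule. The following minimal Python sketch is therefore an assumption throughout, not the authors' algorithm. It uses the mean Shannon entropy of the learners' votes as a diversity measure, a simple majority vote in place of the XGBoost second-level learner, and omits the AP clustering step. All function names (`f1_score`, `vote_entropy`, `majority_vote`, `select_learners`) and the toy predictions are hypothetical.

```python
from math import log2


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); 0.0 when it is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def vote_entropy(preds):
    """Mean binary Shannon entropy of the learners' votes over all samples.

    Assumed diversity measure: 0 when all learners agree on every sample,
    1 when the votes split evenly on every sample.
    """
    n_learners = len(preds)
    total = 0.0
    for column in zip(*preds):
        p = sum(column) / n_learners  # share of learners voting 1
        if 0 < p < 1:
            total -= p * log2(p) + (1 - p) * log2(1 - p)
    return total / len(preds[0])


def majority_vote(preds):
    """Combine binary predictions; ties go to class 1 (assumed stand-in
    for a trained second-level learner)."""
    n_learners = len(preds)
    return [1 if 2 * sum(column) >= n_learners else 0 for column in zip(*preds)]


def select_learners(candidates, y_true):
    """Greedy selection (assumed rule): start from the best single F1, then
    keep adding the candidate that yields the highest ensemble F1, breaking
    ties by higher vote entropy, until F1 stops improving."""
    remaining = dict(candidates)
    first = max(remaining, key=lambda name: f1_score(y_true, remaining[name]))
    selected = [first]
    preds = [remaining.pop(first)]
    best_f1 = f1_score(y_true, preds[0])
    while remaining:
        def score(name):
            trial = preds + [remaining[name]]
            return (f1_score(y_true, majority_vote(trial)), vote_entropy(trial))

        name = max(remaining, key=score)
        new_f1, _ = score(name)
        if new_f1 <= best_f1:
            break
        selected.append(name)
        preds.append(remaining.pop(name))
        best_f1 = new_f1
    return selected, best_f1


# Toy validation-set predictions from three hypothetical base learners.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
candidates = {
    "logistic": [1, 1, 0, 0, 0, 0, 1, 0],
    "bp_net": [1, 0, 1, 0, 0, 1, 1, 0],
    "adaboost": [0, 1, 1, 0, 1, 0, 1, 0],
}
selected, ensemble_f1 = select_learners(candidates, y_true)
print(selected, ensemble_f1)
```

In this toy data each learner errs on different samples, so the combined vote improves on every individual learner. In a stacking setup, the selected learners' outputs would instead be fed as features to the second-level learner.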

 
