Orthogonal support vector machine and its application in credit scoring  Cited by: 16


Authors: HAN Lu[1], HAN Li-yan[2]

Affiliations: [1] School of Management Science and Engineering, Central University of Finance & Economics, Beijing 100081, China; [2] School of Economics and Management, Beihang University, Beijing 100191, China

Source: Journal of Industrial Engineering and Engineering Management (《管理工程学报》), 2017, No. 2, pp. 128-136 (9 pages)

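Funding: National Philosophy and Social Science Foundation of China, Youth Project (13CTJ004); National Natural Science Foundation of China, Key Program (71232003); National Natural Science Foundation of China, General Program (71371022)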

Abstract: Although logistic regression is currently the most commonly used credit scoring method in practice, research results show that the support vector machine is a more effective method for credit scoring modeling. However, both logistic regression and the support vector machine face the curse of dimensionality in high-dimensional classification problems. For these reasons, the authors propose the orthogonal support vector machine and compare it with commonly used feature extraction methods, such as principal component analysis and stepwise regression, on the German credit card data set. The cross-validation results show that the orthogonal support vector machine performs better in both scoring effectiveness and scoring efficiency.

Researchers have found that the support vector machine (SVM) can provide better performance in credit scoring prediction. However, the SVM is a black-box method and lacks rules for selecting good input variables. Like other artificial intelligence methods, it faces the problem of "garbage in, garbage out", so the SVM is often troubled by the curse of dimensionality. To address this, we introduce orthogonal feature extraction techniques combined with logistic regression and the SVM, which give better interpretability of the input variables, reduce the dimension, and accelerate convergence. In this paper, we introduce a new way to address the orthogonal dimension reduction (ODR) problem. We discuss the related properties of this method in detail and test it against other common statistical approaches, including principal component analysis (PCA) and hybridizing logistic regression (HLG), to better solve and evaluate the data. Because the SVM is sensitive to initial conditions and training algorithms, we use a grid search to select its parameters. Then, we design the process of reducing dimension with PCA, ODR, and HLG to remove redundant variables; this process can resolve the problem of high multicollinearity to a certain degree. In experiments on the German data set, we also observe an interesting phenomenon in the use of the SVM, which we define as "dimensional interference". Because Type I error is more serious in credit scoring and cannot be arbitrarily cut off, we also use the area under the ROC curve to judge the effectiveness of the different models. Based on the cross-validation results, using logistic regression to filter the dummy variables, together with orthogonal feature extraction, allows the SVM not only to reduce complexity and accelerate convergence, but also to achieve better performance.
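The abstract names the area under the ROC curve as the criterion for comparing models. As a minimal sketch of that criterion only (not the authors' code), the area can be computed as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name `roc_auc`, the choice of label 1 for the positive class, and the toy scores are all assumptions made for illustration; none of them come from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for the positive class, 0 for the negative class
            (which class counts as "positive" is an assumption here).
    scores: model outputs; a higher score means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count pairs where the positive outranks the negative; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, purely illustrative (not data from the paper).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Because the value depends only on how the scores rank the two classes, it needs no classification cutoff, which fits the abstract's remark that the Type I error cannot be arbitrarily cut off. The double loop is quadratic in the number of examples, so a sort-based version would be preferable on larger data sets.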

Keywords: orthogonal support vector machine; curse of dimensionality; logistic regression; credit scoring

Classification code: F224.9 [Economics and Management: National Economy]

 
