Authors: HAN Lu (韩璐) [1], HAN Li-yan (韩立岩) [2]
Affiliations: [1] School of Management Science and Engineering, Central University of Finance & Economics, Beijing 100081, China; [2] School of Economics and Management, Beihang University, Beijing 100191, China
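Source: Journal of Industrial Engineering and Engineering Management (《管理工程学报》), 2017, No. 2, pp. 128-136 (9 pages)
Funding: National Philosophy and Social Science Foundation of China, Youth Project (13CTJ004); National Natural Science Foundation of China, Key Program (71232003); National Natural Science Foundation of China, General Program (71371022)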
Abstract: Although logistic regression is currently the most widely used credit scoring method in practice, research results indicate that the support vector machine is a more effective method for credit scoring models. However, both logistic regression and the support vector machine face the curse of dimensionality in high-dimensional classification problems. For these reasons, the authors propose an orthogonal support vector machine method and compare it, on the German credit card data set, with commonly used feature extraction methods such as principal component analysis and stepwise regression. The cross-validation results show that the orthogonal support vector machine performs better in both scoring effectiveness and scoring efficiency.

Researchers have found that the support vector machine (SVM) can provide better performance in credit scoring prediction. However, the support vector machine is a black-box method and lacks rules for selecting good input variables. Like other artificial intelligence methods, a black-box method faces the problem of "garbage in, garbage out". Thus the SVM is usually troubled by the curse of dimensionality. Focusing on this, we introduce orthogonal feature extraction techniques with logistic regression and the support vector machine, which give better interpretability for the input variables, reduce the dimension, and accelerate convergence. In this paper, we introduce a new way to address the orthogonal dimension reduction problem. We discuss the related properties of this method in detail and test it against other common statistical approaches, including principal component analysis and hybridizing logistic regression, to better solve and evaluate the data. Because the SVM is sensitive to initial conditions and training algorithms, we use a grid search to select its parameters. Then we design the dimension reduction process with PCA, ODR, and HLG to remove redundant variables. The process can thus resolve the problem of high multicollinearity to a certain degree. In experiments on the German data set, there is also an interesting phenomenon in the use of the support vector machine, which we define as "dimensional interference". Because Type I error is more serious in credit scoring and cannot be arbitrarily cut off, we also use the area under the ROC curve to judge the effectiveness of the different models.
Based on the cross-validation results, it can be found that, by using logistic regression to filter the dummy variables and orthogonal feature extraction, the support vector machine can not only reduce complexity and accelerate convergence but also achieve better performance.
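The abstract uses the area under the ROC curve to compare models. As a minimal sketch of that metric only (not the authors' code), the area can be computed as the fraction of positive/negative pairs that a model ranks correctly, with ties counted as half. The function name `roc_auc` and the convention that label 1 marks the positive class are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive class; assumed convention).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for four applicants, two in each class.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of examples, which is acceptable for a data set of about a thousand records; a rank-based computation would be preferable for larger data.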