Detecting anomaly loans on P2P lending platform: Based on hierarchical classification method  Cited by: 9

Original title: 基于“多层次分类”方法的异常P2P网贷借款识别


Authors: LUO Qin-fang (罗钦芳), DING Guo-wei (丁国维), FU Xin (傅馨), CAI Shun (蔡舜) [1], CHEN Xi (陈熹) [2]

Affiliations: [1] School of Management, Xiamen University, Xiamen 361005, Fujian, China; [2] School of Management, Zhejiang University, Hangzhou 310058, Zhejiang, China

Source: Journal of Industrial Engineering and Engineering Management (《管理工程学报》), 2017, No. 3, pp. 201-209 (9 pages)

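Funding: National Natural Science Foundation of China (71572166); National Natural Science Foundation of China (71372057); National Natural Science Foundation of China (71301133); Xiamen University Humanities and Social Sciences "President's Fund: Innovation Team" project (20720161044); Ministry of Education Humanities and Social Sciences Fund project (13YJC630033)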

Abstract: With the development of Internet technology, the number of users and the volume of data in P2P online lending grow by the day. Identifying abnormal loan listings and promoting the healthy development of platforms has long been a focus of public attention. To address this problem, this paper proposes a "hierarchical classification" method that takes the transaction data published by Lending Club as its research object and analyzes the data level by level. At the first level, the density-based DBSCAN clustering algorithm is applied to exclude a large number of normal users, which mitigates the imbalance between the positive and negative classes in the data. At the second level, general classification algorithms are applied, and the platform's abnormal loan listings are finally identified. Numerical experiments show that, compared with other methods, applying the "hierarchical classification" method to P2P online lending identifies loans with abnormal repayment more effectively while preserving the overall performance of the classifier.

With the development of information technology in recent years, financial service intermediaries have entered the Internet era. As the most popular innovative business model in Internet finance, online peer-to-peer (P2P) lending has attracted increasing attention from diverse sectors. Risk and safety are the main concerns in the online P2P lending industry. Apart from the risks posed by P2P platforms themselves, risks arise from delinquent loans. Borrowers of these loans do not make their repayments on time or even default on the loans, which leads to losses for lenders. Thus, it is essential to develop a model that detects these abnormal loans to protect lenders and platforms from risk. Based on secondary data from some P2P platforms, several existing academic studies have investigated the risk issue using statistical approaches (e.g., logistic regression) and data mining approaches (e.g., classification). However, in online P2P lending, the distribution of positive (abnormal loans) and negative (normal loans) samples is often imbalanced. Normal loans are the majority, while abnormal loans account for only a small percentage of loans. According to Lending Club data for the second quarter of 2016, only 12.55% of loans are abnormal. To address this problem, we propose a hierarchical classification method in this paper. At each level of the hierarchy, the model processes and analyzes the data with a different method, chosen according to the characteristics of the data set. In the first level, the unsupervised clustering method DBSCAN is used to filter out some negative samples (normal loans) so that the distribution of positive and negative samples becomes more balanced. In the second level, supervised classification methods, such as random forest and the J48 decision tree, are used to classify the samples that remain after the first level. Given the Lending Club data, experiments were conducted with several models to detect abnormal loans, including fou
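The two-level idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy two-feature data, the parameter values `EPS` and `MIN_PTS`, the rule that every loan DBSCAN places in a cluster is treated as normal, and the use of a k-nearest-neighbours vote in place of random forest or J48 at the second level are all choices made only for this example.

```python
import math
from collections import Counter

EPS, MIN_PTS = 0.25, 5  # assumed DBSCAN parameters, chosen for the toy data

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: -1 for noise, else a cluster id."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(reach)
        cluster += 1
    return labels

def knn_predict(train_X, train_y, query, k=3):
    """Stand-in second-level classifier: majority vote of the k nearest samples."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy data: two scaled features per loan, label 1 = abnormal.
dense_normal = [(0.1 * i, 0.1 * j) for i in range(15) for j in range(15)]
sparse_abnormal = [(3.0 + i, 3.0 + j) for i in range(4) for j in range(4)]
sparse_normal = [(-3.0 - i, 3.0 + j) for i in range(3) for j in range(3)]
X = dense_normal + sparse_abnormal + sparse_normal
y = [0] * len(dense_normal) + [1] * len(sparse_abnormal) + [0] * len(sparse_normal)

# Level 1: loans inside a dense cluster are assumed normal and filtered out.
labels = dbscan(X, EPS, MIN_PTS)
kept = [i for i, lab in enumerate(labels) if lab == -1]
kept_X = [X[i] for i in kept]
kept_y = [y[i] for i in kept]
clustered_X = [X[i] for i, lab in enumerate(labels) if lab != -1]

def two_level_predict(loan, k=3):
    """Level 1 screens out loans in dense regions; level 2 classifies the rest."""
    if any(math.dist(loan, p) <= EPS for p in clustered_X):
        return 0  # assumed normal: lies within eps of a clustered training loan
    return knn_predict(kept_X, kept_y, loan, k)

print(f"abnormal share before level 1: {sum(y) / len(y):.1%}")
print(f"abnormal share after level 1:  {sum(kept_y) / len(kept_y):.1%}")
print(two_level_predict((4.5, 4.5)), two_level_predict((0.72, 0.68)))
```

On this toy data the first level removes only normal loans, so the second-level classifier is trained on a far less imbalanced sample. Real loan data will not separate this cleanly, and how strictly to filter at the first level is a tuning decision.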

Keywords: P2P online lending; anomaly detection; data mining; hierarchical classification

Classification code: C37 [Sociology]

 
