Authors: GUO Qiang-qiang (国强强); ZHU Zhen-fang (朱振方) [1]
Affiliation: [1] School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, Shandong, China
Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 9, pp. 210-215 (6 pages)
Funding: National Social Science Fund of China (19BYY076); Ministry of Education Humanities and Social Sciences Fund (14YJC860042); Shandong Provincial Social Science Planning Project (19BJCJ51, 18CXWJ01, 18BJYJ04, 17CHLJ07C).
Abstract: As technology and society develop, personal credit scores matter more and more to individuals. Traditional credit scoring, however, relies on only a few dimensions, such as personal spending power, and so struggles to reflect a person's credit comprehensively, objectively, and in a timely way. To address credit score prediction on large-sample, high-dimensional data, the paper proposes a mobile-user credit scoring algorithm based on LightGBM, with the aim of improving the credit scoring system. The method first analyzes linear correlation to construct a feature set, then applies the K-means algorithm to cluster that feature set, and finally builds the credit scoring model with LightGBM. Experiments on real data provided by the Digital China Innovation Competition show that the method mines the data features thoroughly and predicts user credit scores accurately, with higher accuracy and computational efficiency than GBDT, XGBoost, and similar algorithms. Clustering the feature set obtained from linear correlation analysis and applying it to the LightGBM-based scoring model therefore predicts mobile users' credit scores more accurately.
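The first two steps named in the abstract (correlation-based feature construction, then K-means clustering) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the abstract does not say how correlation is used or how the clusters feed the model, so the absolute-Pearson threshold, the choice to append the cluster label as an extra column, the toy data, and all function names here are assumptions made purely for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def select_features(rows, target, threshold=0.3):
    """Assumed rule: keep columns whose |correlation| with the target meets the threshold."""
    n_cols = len(rows[0])
    return [j for j in range(n_cols)
            if abs(pearson([r[j] for r in rows], target)) >= threshold]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_feature_set(rows, target, k=2, threshold=0.3):
    """Filter columns by correlation, then append the K-means label as a new column."""
    keep = select_features(rows, target, threshold)
    reduced = [[r[j] for j in keep] for r in rows]
    labels = kmeans(reduced, k)
    return keep, [r + [lab] for r, lab in zip(reduced, labels)]


# Toy data (invented): column 0 tracks the score, column 1 is constant,
# column 2 is nearly uncorrelated noise.
rows = [[1, 5, 3], [2, 5, 1], [3, 5, 2], [10, 5, 2], [11, 5, 3], [12, 5, 1]]
scores = [10, 20, 30, 100, 110, 120]

kept_columns, features = build_feature_set(rows, scores)
print(kept_columns)
print(features)
```

In a full pipeline, the resulting feature rows would then be passed to a gradient-boosting model such as `lightgbm.LGBMRegressor`; that step is omitted here so the sketch runs without third-party packages, and treating the task as regression is itself an assumption.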
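Keywords: score prediction; LightGBM algorithm; K-means algorithm; feature data; linear correlation; random forest; credit scoring
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]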