Research on Company Bankruptcy Prediction Based on Unbalanced Data (cited by: 3)


Authors: ZHOU Wenyong (周文泳) [1]; FENG Lixia (冯丽霞); DUAN Chunyan (段春艳) (School of Economics and Management, Tongji University, Shanghai 200092, China; School of Mechanical Engineering, Tongji University, Shanghai 201804, China)

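Affiliations: [1] School of Economics and Management, Tongji University, Shanghai 200092; [2] School of Mechanical and Energy Engineering, Tongji University, Shanghai 201804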

Source: Journal of Tongji University (Natural Science), 2022, No. 2, pp. 283-290 (8 pages)

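Funding: 2020 Tongji University "Dual Leader" Faculty Party Branch Secretary Academic Capacity Improvement Program; Shanghai Pujiang Talent Program (20PJ1413700).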

Abstract: This paper addresses corporate bankruptcy prediction from unbalanced data by combining data preprocessing techniques with ensemble algorithms. First, the unbalanced data are preprocessed using redundant-information handling and several sampling methods. Second, a C5.0 (Classifier 5.0) decision tree and a single-hidden-layer feedforward neural network serve as base classifiers; each is combined with three types of resampling preprocessing techniques to select the best sampling method. Third, bootstrap aggregating (bagging) is applied to improve classification performance, and the ensemble models built on the two base classifiers are compared using the area under the receiver operating characteristic curve (AUC) under 10-fold cross-validation. Finally, the approach is validated experimentally on real data from more than 10,000 Polish manufacturing companies, taken from the University of California, Irvine (UCI) database. The experimental results show that the ensemble model combining under-sampling or the synthetic minority over-sampling technique (SMOTE) with the neural network achieves the best classification performance, providing support for enterprises implementing bankruptcy prediction.
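
The resampling and evaluation ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`random_undersample`, `smote_like_oversample`, `roc_auc`), the neighbour count `k`, and the 1 = bankrupt / 0 = non-bankrupt label coding are all assumptions made for illustration, and the bagging and classifier-training steps are left out. The sketch only shows how under-sampling discards majority rows, how SMOTE-style over-sampling interpolates new minority rows, and how AUC can be computed from scores.

```python
import random


def random_undersample(X, y, seed=0):
    """Randomly drop rows of the larger class until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in kept], [y[i] for i in kept]


def smote_like_oversample(X, y, k=3, seed=0):
    """Add synthetic label-1 rows until the classes balance.

    Each synthetic row lies on the segment between a minority row and one of
    its k nearest minority neighbours (the core idea of SMOTE).
    """
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    n_needed = (len(y) - len(minority)) - len(minority)
    if len(minority) < 2 or n_needed <= 0:
        return list(X), list(y)
    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority)
        # Nearest minority neighbours by squared Euclidean distance, excluding base.
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return list(X) + synthetic, list(y) + [1] * len(synthetic)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win; both classes must be present.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: 16 majority rows (label 0) and 4 minority rows (label 1).
    X = [[float(i), float(i % 3)] for i in range(20)]
    y = [1 if i >= 16 else 0 for i in range(20)]
    X_under, y_under = random_undersample(X, y)
    X_over, y_over = smote_like_oversample(X, y)
    print(len(y_under), sum(y_under))  # balanced by removing majority rows
    print(len(y_over), sum(y_over))    # balanced by adding synthetic minority rows
```

In a 10-fold cross-validation setting like the one the abstract describes, resampling would normally be applied only to the training folds, so that the AUC on each held-out fold reflects the original class imbalance.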

Keywords: binary classification; unbalanced data; neural network; C5.0 decision tree; ensemble method

Classification codes: F272 [Economics and Management: Enterprise Management]; TP183 [Economics and Management: National Economy]

 
