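Authors: 莫赞 (MO Zan) [1]; 盖彦蓉 (GAI Yanrong); 樊冠龙 (FAN Guanlong) (School of Management, Guangdong University of Technology, Guangzhou, Guangdong 510000, China; Department of Computer Science, Hong Kong Baptist University, Hong Kong 999077, China)
Affiliations: [1] School of Management, Guangdong University of Technology, Guangzhou 510520; [2] Department of Computer Science, Hong Kong Baptist University, Hong Kong 999077
Source: Journal of Computer Applications (《计算机应用》), 2019, No. 2, pp. 618-622 (5 pages)
Funding: National Natural Science Foundation of China (711710); National Key Technology R&D Program of the 12th Five-Year Plan (2011BAD13B11); Guangdong Province Marine Economy Innovation and Development Regional Demonstration Special Project (GD2013-D01-001)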
Abstract: Traditional single classifiers perform poorly on imbalanced data. To address this, a new classification algorithm for binary-class imbalanced data sets is proposed, based on Generative Adversarial Nets (GAN) and ensemble learning: Generative Adversarial Nets-Adaptive Boosting-Decision Tree (GAN-AdaBoost-DT). First, a GAN is trained to obtain a generative model, which produces minority-class samples to reduce the imbalance ratio. Second, the generated minority-class samples are fed into the Adaptive Boosting (AdaBoost) framework and their weights are changed, improving the AdaBoost model and the classification performance of AdaBoost with Decision Tree (DT) as the base classifier. The Area Under the receiver operating characteristic Curve (AUC) is used as the evaluation metric. Experiments on a credit card fraud data set show that, compared with synthetic minority over-sampling ensemble learning, the proposed algorithm increases accuracy by 4.5% and AUC by 6.5%; compared with modified synthetic minority over-sampling ensemble learning, it increases accuracy by 4.9% and AUC by 5.9%; and compared with random under-sampling ensemble learning, it increases accuracy by 4.5% and AUC by 5.4%. Experiments on other data sets from UCI and KEEL show that the algorithm improves overall accuracy on imbalanced binary classification problems and improves classifier performance.
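The pipeline outlined in the abstract (generate minority samples, feed them into AdaBoost with adjusted weights, evaluate with AUC) can be illustrated with a small pure-Python sketch. This is not the authors' implementation, and several parts are assumptions. `synthesize_minority` is a hypothetical stand-in that jitters real minority points instead of sampling from a trained GAN. The base learner is a depth-1 decision stump rather than a full decision tree. The abstract does not give the weight rule, so giving synthetic samples a lower initial weight (`synthetic_weight=0.5`) is invented for illustration. The toy data in `make_data` is also made up.

```python
import math
import random


def synthesize_minority(minority, n_new, rng, noise=0.1):
    """Stand-in for a trained GAN generator (assumption, not a GAN):
    returns jittered copies of randomly chosen real minority samples."""
    return [[v + rng.gauss(0.0, noise) for v in rng.choice(minority)]
            for _ in range(n_new)]


def fit_stump(X, y, w):
    """Exhaustively pick the one-feature threshold rule with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if (sign if x[j] >= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best  # (weighted error, feature index, threshold, sign)


def stump_predict(stump, x):
    _, j, t, sign = stump
    return sign if x[j] >= t else -sign


def adaboost(X, y, w0, rounds=10):
    """Discrete AdaBoost with labels in {-1, +1} and custom initial weights w0."""
    total = sum(w0)
    w = [wi / total for wi in w0]
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, x))
             for x, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model


def score(model, x):
    """Weighted vote of all stumps; larger means more likely minority (+1)."""
    return sum(alpha * stump_predict(stump, x) for alpha, stump in model)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_data(rng, n_major, n_minor):
    """Toy 2-D data (made up): majority (-1) near (0, 0), minority (+1) near (2, 2)."""
    X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(n_major)]
    X += [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(n_minor)]
    return X, [-1] * n_major + [1] * n_minor


def run_demo(seed=0, synthetic_weight=0.5):
    rng = random.Random(seed)
    X, y = make_data(rng, 100, 10)                 # 10:1 imbalance
    minority = [x for x, yi in zip(X, y) if yi == 1]
    fake = synthesize_minority(minority, 90, rng)  # rebalance to 100:100
    X_train, y_train = X + fake, y + [1] * len(fake)
    # Assumed weighting: synthetic samples start with less weight than real ones.
    w0 = [1.0] * len(X) + [synthetic_weight] * len(fake)
    model = adaboost(X_train, y_train, w0)
    X_test, y_test = make_data(rng, 100, 10)       # held-out, no synthetic points
    return auc([score(model, x) for x in X_test], y_test)


if __name__ == "__main__":
    print("test AUC:", run_demo())
```

The test set contains only real samples, so the AUC reflects performance on the original imbalanced distribution rather than on the rebalanced training set.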
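Keywords: generative adversarial network; ensemble learning; imbalanced classification; binary classification; adaptive boosting; decision tree; credit card fraud
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]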