An Ensemble Tree Classifier for Highly Imbalanced Data Classification  


Authors: SHI Peibei, WANG Zhong

Affiliation: [1] School of Computer Science and Technology, Hefei Normal University, Hefei 230601, China

Source: Journal of Systems Science & Complexity (系统科学与复杂性学报, English edition), 2021, Issue 6, pp. 2250-2266 (17 pages)

Funding: Supported by the National Natural Science Foundation of China under Grant No. 61976198; the Natural Science Research Key Project for Colleges and Universities of Anhui Province under Grant No. KJ2019A0726; and the High-level Scientific Research Foundation for the Introduction of Talent of Hefei Normal University under Grant No. 2020RCJJ44.

Abstract: The performance of traditional imbalanced classification algorithms degrades on highly imbalanced data, and handling such data remains a difficult problem. In this paper, the authors propose an ensemble tree classifier for highly imbalanced data classification. The ensemble tree classifier is constructed with a complete binary tree structure. A mathematical model is established based on the features and classification performance of the classifier, and it is proven that the model parameters of the ensemble classifier can be solved by calculation. First, the AdaBoost method is used as the benchmark classifier to construct the tree structure model. Then, the classification cost of the model is calculated, and a quantitative mathematical description of the relationship between the cost and the features of the ensemble tree classifier model is obtained. Next, the cost of the classification model is cast as an optimization problem, and the parameters of the ensemble tree classifier are obtained through theoretical derivation. The approach is tested on several highly imbalanced datasets from different fields, with the AUC (area under the curve) and the F-measure as evaluation criteria. Compared with traditional imbalanced classification algorithms, the ensemble tree classifier achieves better classification performance.

Keywords: ensemble learning; F-measure; imbalanced classification; mathematical model

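Classification number (CLC): TP311.13 [Automation and Computer Technology: Computer Software and Theory]; TP181 [Automation and Computer Technology: Computer Science and Technology]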
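
The abstract names the AUC and the F-measure as its evaluation criteria. The sketch below is not the authors' code; it is a minimal pure-Python illustration of the standard definitions of both metrics. The helper names (`f_measure`, `auc`, `accuracy`), the toy labels and scores, and the 0.5 decision threshold are all assumptions made for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f_measure(y_true, y_pred, beta=1.0):
    """F-measure of the positive (minority) class, encoded as label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def auc(y_true, scores):
    """AUC as the probability that a random positive example receives a
    higher score than a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 positives among 100 examples (2% minority class).
y_true = [1, 1] + [0] * 98

# A trivial predictor that always answers "negative".
all_negative = [0] * 100

# Hypothetical scores from some classifier, thresholded at 0.5 (assumed).
scores = [0.9, 0.4] + [0.6] + [0.1] * 97
thresholded = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy (all negative):", accuracy(y_true, all_negative))
print("F-measure (all negative):", f_measure(y_true, all_negative))
print("accuracy (thresholded):", accuracy(y_true, thresholded))
print("F-measure (thresholded):", f_measure(y_true, thresholded))
print("AUC (scores):", auc(y_true, scores))
```

On this toy data both predictors reach the same accuracy of 0.98, yet the always-negative predictor has an F-measure of 0 while the thresholded one reaches 0.5. This shows why accuracy alone is uninformative when the minority class is very small. The pairwise AUC loop is quadratic in the number of examples; it is written for clarity, not for large datasets.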

 
