Cost-sensitive AdaBoost Algorithm for Multi-class Classification Problems (多分类问题代价敏感AdaBoost算法)  Cited by: 32


Author: Fu Zhongliang (付忠良)[1]

Affiliation: [1] Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu 610041

Source: Acta Automatica Sinica (《自动化学报》), 2011, No. 8, pp. 973-983 (11 pages)

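Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2008AA01Z402) and the Sichuan Province Science and Technology Support Program (2008SZ0100, 2009SZ0214)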

Abstract: Existing approaches to multi-class cost-sensitive classification convert the problem into two-class cost-sensitive problems, which forces the misclassification costs to be merged. To avoid this cost-merging problem, a cost-sensitive AdaBoost algorithm that applies directly to multi-class classification is constructed. The algorithm has a flow and an error-estimation formula similar to those of real AdaBoost. When all costs are equal, it reduces to a new real AdaBoost algorithm for multi-class classification, which guarantees that the training error rate of the combined classifier decreases as the number of trained classifiers increases. It does not directly require the individual classifiers to be mutually independent; rather, the independence condition can be ensured by the rules of the algorithm, whereas the derivation of existing multi-class real AdaBoost algorithms must assume it. Experimental results show that the algorithm does bias classification results toward the classes with lower misclassification cost. In particular, when the costs of misclassifying each class as the other classes are imbalanced but the average costs are equal, existing multi-class cost-sensitive learning algorithms fail, while the new method still achieves the minimum misclassification cost. The research method offers a new approach for further study of ensemble learning algorithms and yields an AdaBoost algorithm for multi-label classification that is easy to apply and approximately minimizes the classification error rate.
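The cost-merging issue mentioned in the abstract can be illustrated with a small sketch. The record does not give the paper's boosting procedure, so the code below is not that algorithm; it only applies the generic minimum-expected-cost decision rule to a hypothetical 3-class cost matrix whose per-class average costs are all equal. The matrix `COST`, the probabilities `PROBS`, and both helper functions are assumptions made for illustration.

```python
# Illustrative only: a generic minimum-expected-cost decision rule,
# not the boosting algorithm proposed in the paper.

def row_average_costs(cost):
    """Average cost of misclassifying each true class (diagonal excluded)."""
    k = len(cost)
    return [sum(cost[i][j] for j in range(k) if j != i) / (k - 1)
            for i in range(k)]


def min_expected_cost_label(probs, cost):
    """Return (label, expected costs), where label minimizes
    sum_i probs[i] * cost[i][j] over predicted labels j."""
    k = len(cost)
    expected = [sum(probs[i] * cost[i][j] for i in range(k))
                for j in range(k)]
    best = min(range(k), key=lambda j: expected[j])
    return best, expected


# Hypothetical cost matrix: COST[i][j] is the cost of predicting class j
# when the true class is i. Every row averages to the same cost.
COST = [
    [0, 1, 9],
    [9, 0, 1],
    [1, 9, 0],
]

# Hypothetical class-probability estimates for a single sample.
PROBS = [0.5, 0.3, 0.2]

averages = row_average_costs(COST)
label, expected = min_expected_cost_label(PROBS, COST)
most_probable = max(range(len(PROBS)), key=lambda i: PROBS[i])

print("average cost per true class:", averages)
print("expected cost per predicted label:", expected)
print("most probable class:", most_probable, "| minimum-cost class:", label)
```

In this example, merging each row into its average leaves no cost information to distinguish the classes, so a method working from the merged costs would fall back to the most probable class, while the full matrix selects a different, cheaper label.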

Keywords: cost-sensitive learning; multi-class classification; multi-label classification; real AdaBoost; cost-sensitive classification

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
