Ensemble One-class Classifiers Based on Hybrid Diversity Generation and Pruning

Cited by: 9

Authors: LIU Jiachen[1], MIAO Qiguang[1], CAO Ying[1], SONG Jianfeng[1], QUAN Yining[1]

Affiliation: [1] School of Computer Science, Xidian University, Xi'an 710071, China

Source: Journal of Electronics & Information Technology, 2015, No. 2, pp. 386-393 (8 pages)

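Funding: National Natural Science Foundation of China (61272280; 41271447; 61272195); Program for New Century Excellent Talents in University, Ministry of Education (NCET-12-0919); Fundamental Research Funds for the Central Universities (K5051203020; K5051303016; K5051303018; BDY081422; K50513100006); Xi'an Science and Technology Bureau project (CXY1341(6))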

Abstract: Applying classical ensemble learning methods directly to one-class classifiers gives unsatisfactory results. To address this problem, this paper first proves that ensemble learning can improve the performance of one-class classifiers, and also that performance degrades after combination if the set of base classifiers is not selected. It then shows that classical ensemble methods, when applied directly to one-class classifiers, suffer from a severe lack of diversity among the base classifiers, and proposes a hybrid strategy for generating base one-class classifiers with higher diversity. Finally, the loss function of the ensemble one-class classifier is decomposed according to the components of the ensemble loss, a pruning strategy is constructed on that basis, and the two steps are combined into an ensemble algorithm named the Pruned Hybrid Diverse Ensemble One-class Classifier (PHD-EOC). Experiments on UCI benchmark datasets and a malicious program behavior detection dataset show that PHD-EOC balances diversity against one-class classification performance, outperforms classical ensemble learning methods on all the one-class evaluation metrics considered, and reduces the time complexity of the decision phase.
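The generate-then-prune pipeline described in the abstract can be illustrated with a minimal sketch. It is not the PHD-EOC algorithm itself, whose generation strategy and loss-based pruning criterion are not specified in this record. The sketch makes several assumptions: "hybrid" diversity is read as bootstrap resampling combined with random feature subspaces; the base learner is a simple centroid-distance model; and pruning is a greedy forward selection on a small labeled validation set that contains a few outliers, which a strict one-class setting may not provide. All function names and the toy data are hypothetical.

```python
import random

def _dist(x, feats, center):
    # Euclidean distance restricted to the member's feature subset.
    return sum((x[f] - c) ** 2 for f, c in zip(feats, center)) ** 0.5

def train_member(data, rng, n_feats, quantile=0.9):
    """Fit one centroid-distance one-class model (assumed base learner) on a
    bootstrap sample restricted to a random feature subset."""
    feats = rng.sample(range(len(data[0])), n_feats)   # random subspace
    boot = [rng.choice(data) for _ in data]            # bootstrap resample
    center = [sum(x[f] for x in boot) / len(boot) for f in feats]
    dists = sorted(_dist(x, feats, center) for x in boot)
    radius = dists[int(quantile * (len(dists) - 1))]   # accept roughly `quantile` of targets
    return feats, center, radius

def member_predict(member, x):
    feats, center, radius = member
    return 1 if _dist(x, feats, center) <= radius else -1  # +1 target, -1 outlier

def ensemble_predict(members, x):
    votes = sum(member_predict(m, x) for m in members)
    return 1 if votes >= 0 else -1                     # majority vote, ties go to target

def accuracy(members, X, y):
    return sum(ensemble_predict(members, x) == t for x, t in zip(X, y)) / len(X)

def greedy_prune(pool, X_val, y_val):
    """Stand-in pruning rule (assumption): forward selection that adds the
    member giving the largest validation accuracy and stops when no
    candidate yields a strict improvement."""
    selected, best, remaining = [], -1.0, list(pool)
    while remaining:
        score, pick = max(
            ((accuracy(selected + [m], X_val, y_val), i)
             for i, m in enumerate(remaining)),
            key=lambda t: t[0],
        )
        if score <= best:
            break
        best = score
        selected.append(remaining.pop(pick))
    return selected

# Toy data (hypothetical): targets around the origin, outliers shifted away.
rng = random.Random(0)
target = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(120)]
outliers = [[rng.gauss(4, 1) for _ in range(4)] for _ in range(30)]
train, val_targets = target[:80], target[80:]
X_val = val_targets + outliers
y_val = [1] * len(val_targets) + [-1] * len(outliers)

pool = [train_member(train, rng, n_feats=2) for _ in range(15)]
pruned = greedy_prune(pool, X_val, y_val)
print(len(pool), len(pruned), accuracy(pruned, X_val, y_val))
```

Because the pruned ensemble contains fewer members than the pool, each prediction evaluates fewer base classifiers, which is the sense in which pruning can reduce decision-phase cost.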

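Keywords: machine learning; one-class classification; ensemble one-class classification; classifier diversity; ensemble pruning; ensemble learning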

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
