Classification Uncertainty Minimization-Based Semi-Supervised Ensemble Learning Algorithm  Cited by: 2


Authors: HE Yulin (何玉林); ZHU Penghui (朱鹏辉); HUANG Zhexue (黄哲学); Fournier-Viger PHILIPPE[2]

Affiliations: [1] Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, Guangdong 518107, China; [2] College of Computer Science & Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China

Source: Computer Science (《计算机科学》), 2023, No. 10, pp. 88-95 (8 pages)

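Funding: National Natural Science Foundation of China, General Program (61972261); Natural Science Foundation of Guangdong Province, General Program (2023A1515011667); Shenzhen Basic Research Key Project (JCYJ20220818100205012); Shenzhen Basic Research General Project (JCYJ20210324093609026).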

Abstract: Semi-supervised ensemble learning (SSEL) is a learning paradigm that combines semi-supervised learning with ensemble learning. On the one hand, it uses unlabeled samples to increase the diversity of the ensemble and to alleviate the shortage of training samples in ensemble learning; on the other hand, integrating multiple classifiers can further improve the performance of semi-supervised learning models. Existing studies have demonstrated the mutual benefit between semi-supervised learning and ensemble learning from both theoretical and practical perspectives. However, current SSEL algorithms do not fully exploit the information in unlabeled samples, which limits their predictive capability on classification problems with few labeled samples. This paper proposes a novel classification uncertainty minimization-based semi-supervised ensemble learning (CUM-SSEL) algorithm. It introduces information entropy as the confidence criterion for labeling unlabeled samples and trains classifiers iteratively by minimizing the uncertainty of the labeling process, so that unlabeled samples are used efficiently and the generalization performance of the classifiers is enhanced. The feasibility, rationality, and effectiveness of CUM-SSEL are verified on standard experimental datasets. The experimental results show that the training of CUM-SSEL tends to converge as the number of base classifiers increases, and that it achieves higher classification accuracy than Self-Training, Co-Training, Tri-Training, Semi-Boost, Vote-Training, Semi-Bagging, and CST-Voting.
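
As a rough illustration of the entropy-based confidence criterion mentioned in the abstract, the Python sketch below scores each unlabeled sample by the Shannon entropy of the averaged class probabilities of an ensemble and pseudo-labels only the low-entropy samples. The abstract does not describe the algorithm's details, so the function names, the averaging of member outputs, and the `max_entropy` threshold are assumptions made for illustration; this is not the authors' CUM-SSEL procedure.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete class-probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def average_votes(member_probs):
    """Average the class-probability vectors produced by the ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(m[c] for m in member_probs) / n_members for c in range(n_classes)]


def select_pseudo_labels(ensemble_outputs, max_entropy):
    """Return (sample_index, label, entropy) for samples whose entropy <= max_entropy.

    ensemble_outputs[i] holds one probability vector per ensemble member
    for unlabeled sample i. The threshold max_entropy is a hypothetical
    hyperparameter, not a value taken from the paper.
    """
    selected = []
    for i, member_probs in enumerate(ensemble_outputs):
        avg = average_votes(member_probs)
        h = shannon_entropy(avg)
        if h <= max_entropy:
            # Pseudo-label is the class with the highest averaged probability.
            label = max(range(len(avg)), key=avg.__getitem__)
            selected.append((i, label, h))
    return selected


if __name__ == "__main__":
    # Three unlabeled samples, two ensemble members, two classes (toy data).
    outputs = [
        [[0.9, 0.1], [1.0, 0.0]],  # members agree: low entropy
        [[0.5, 0.5], [0.6, 0.4]],  # members unsure: high entropy
        [[0.0, 1.0], [0.0, 1.0]],  # members certain: zero entropy
    ]
    for index, label, h in select_pseudo_labels(outputs, max_entropy=0.5):
        print(index, label, round(h, 3))
```

In an iterative scheme of this kind, the selected samples would be added to the labeled set with their pseudo-labels and the classifiers retrained, repeating until no further samples fall below the threshold.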

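Keywords: semi-supervised ensemble learning; ensemble learning; semi-supervised learning; classification uncertainty; confidence; information entropy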

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
