Sample Specification Method Based on Accelerated Backbone Binary Particle Swarm Optimization

Original title: 基于加速骨干二元粒子群优化的样本规约方法


Authors: Luo Shaofu (罗少甫) [1]; Liu He (刘河)

Affiliations: [1] Department of Basic Sciences, Chongqing Aerospace Vocational and Technical College, Chongqing 400021, China; [2] Chongqing Academy of Education Science, Chongqing 400015, China

Source: Statistics & Decision (《统计与决策》), 2024, No. 22, pp. 59-64 (6 pages)

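Funding: Science and Technology Research Project of the Chongqing Municipal Education Commission (KJQN202203007); Chongqing Education Science Planning Project (K22YG218233); Chongqing Special Project for Performance Incentive and Guidance of Research Institutes (cstc2022jxj10214); Key Project of the Science and Technology Research Program of the Chongqing Municipal Education Commission (KJZD-K202114401).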

Abstract: Sample specification is a prominent data preprocessing paradigm in statistical machine learning: it removes redundant samples and noise from labeled training sets and thereby improves the performance of classification statistical algorithms. Although many sample specification methods based on evolutionary algorithms have been proposed and shown to be effective, existing methods of this kind depend on too many parameters. Moreover, as the number of samples in the labeled training set grows, their search efficiency falls and their time cost rises. To overcome these problems, this paper proposes a sample specification method based on accelerated backbone binary particle swarm optimization (SRM-HBPSO). In SRM-HBPSO, first, an accelerated backbone binary particle swarm optimization algorithm (HBPSO) combined with a search space reduction strategy is designed; second, HBPSO is used to optimize the labeled training set, yielding an optimized reduced subset; finally, SRM-HBPSO trains the given classification statistical algorithm on the optimized reduced subset, thereby improving its performance. Simulation experiments on 10 real benchmark datasets from finance, medicine, imaging and other fields show that SRM-HBPSO outperforms 5 advanced sample specification algorithms in terms of both the average classification accuracy of the random forest classification statistical algorithm and the average sample reduction rate.
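The three steps named in the abstract (search for a keep/drop mask over the labeled training set, take the reduced subset, train the classifier on it) can be illustrated with a small sketch. This is not the authors' HBPSO: the record gives no update equations, so the velocity-free bit update, the 1-nearest-neighbour stand-in classifier, the fitness weight `alpha`, and every function name below are assumptions made for illustration, and the search space reduction strategy is not modeled.

```python
import random

def one_nn_accuracy(train, valid):
    """Accuracy of a 1-nearest-neighbour classifier (stand-in for the paper's classifier)."""
    if not train:
        return 0.0
    correct = 0
    for x, y in valid:
        nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
        correct += nearest[1] == y
    return correct / len(valid)

def fitness(mask, train, valid, alpha=0.9):
    """Weighted sum of validation accuracy and sample reduction rate (alpha is assumed)."""
    subset = [s for s, keep in zip(train, mask) if keep]
    reduction = 1.0 - len(subset) / len(train)
    return alpha * one_nn_accuracy(subset, valid) + (1 - alpha) * reduction

def binary_swarm_select(train, valid, n_particles=10, n_iters=20, seed=0):
    """Search for a boolean keep/drop mask with a simplified, velocity-free swarm update."""
    rng = random.Random(seed)
    n = len(train)
    swarm = [[rng.random() < 0.5 for _ in range(n)] for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_fit = [fitness(p, train, valid) for p in swarm]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            new = []
            for d in range(n):
                # Copy each bit from the personal or global best (no velocity term).
                bit = pbest[i][d] if rng.random() < 0.5 else gbest[d]
                # Flip with probability 1/n so the swarm keeps exploring.
                if rng.random() < 1.0 / n:
                    bit = not bit
                new.append(bit)
            f = fitness(new, train, valid)
            swarm[i] = new
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = new, f
                if f > gbest_fit:
                    gbest, gbest_fit = new[:], f
    return gbest, gbest_fit

def make_data(rng, n_per_class):
    """Two Gaussian clusters in 2-D, one per class (synthetic toy data)."""
    data = []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            data.append(((rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)), label))
    return data

rng = random.Random(1)
train = make_data(rng, 15)
valid = make_data(rng, 15)
mask, score = binary_swarm_select(train, valid)
subset = [s for s, keep in zip(train, mask) if keep]  # reduced training set
print(len(train), "->", len(subset), "samples; fitness", round(score, 3))
```

The fitness rewards both accuracy and reduction, mirroring the two criteria the abstract reports. A practical version would swap in the target classifier (for example a random forest) and a held-out or cross-validated accuracy estimate.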

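Keywords: statistical machine learning; classification statistical algorithm; sample specification; random forest; search space reduction strategy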

Classification codes: O212 [Science: Probability Theory and Mathematical Statistics]; TP391 [Science: Mathematics]

 
