Authors: ZHOU Gang; GUO Fu-liang[1] (Naval University of Engineering, Wuhan 430033, China)
Affiliation: [1] Naval University of Engineering, Wuhan 430033, China
Source: Computer Science (《计算机科学》), 2021, No. S01, pp. 250-254 (5 pages)
Abstract: Prediction-error analysis and the bias-variance decomposition of ensemble learning show that an ensemble built from a limited number of accurate and diverse base learners achieves better generalization accuracy. A two-stage feature-selection ensemble learning method based on information entropy is constructed. In the first stage, a base feature set B, whose features each exceed 0.5 accuracy, is built according to relative classification information entropy. In the second stage, independence is judged on B by a mutual-information criterion and independent feature subsets are constructed with a greedy algorithm; the Jaccard coefficient is then used to evaluate diversity among the subsets, and diverse independent subsets are selected to build the base learners. Data experiments show that the optimized method outperforms plain Bagging in both execution efficiency and test accuracy, with the largest gains on multi-class high-dimensional datasets, but it is not suitable for binary classification problems.
Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
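The two-stage procedure in the abstract can be sketched as follows. This is a minimal illustration under assumed stand-ins, not the authors' implementation: a histogram plug-in estimator replaces the paper's entropy measures, a median-split decision stump serves as the stage-1 per-feature accuracy filter (the paper uses relative classification information entropy), and the thresholds `mi_cap` and `div_cap` are hypothetical parameters chosen for the demo.

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Plug-in mutual-information estimate (nats) via joint histogram binning."""
    ex = np.histogram_bin_edges(x, bins)[1:-1]
    ey = np.histogram_bin_edges(y, bins)[1:-1]
    joint = np.zeros((bins, bins))
    for i, j in zip(np.digitize(x, ex), np.digitize(y, ey)):
        joint[i, j] += 1
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def jaccard(a, b):
    """Jaccard similarity between two feature subsets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def stump_accuracy(x, y):
    """Accuracy of a one-feature stump: median split, majority vote per side."""
    t = np.median(x)
    correct = 0
    for side in (y[x <= t], y[x > t]):
        if side.size:
            correct += np.unique(side, return_counts=True)[1].max()
    return correct / y.size

def build_subsets(X, B, n_subsets=2, k=2, mi_cap=0.3, div_cap=0.5, seed=0):
    """Stage 2: greedily grow subsets of mutually low-MI ('independent') features
    from B, keeping only subsets that stay diverse (low Jaccard) w.r.t. earlier ones."""
    rng = np.random.default_rng(seed)
    subsets = []
    for _ in range(10 * n_subsets):          # bounded number of candidate attempts
        order = rng.permutation(B)
        sub = [int(order[0])]
        for f in order[1:]:
            if len(sub) == k:
                break
            if all(mutual_info(X[:, f], X[:, g]) < mi_cap for g in sub):
                sub.append(int(f))
        if all(jaccard(sub, s) < div_cap for s in subsets):
            subsets.append(sorted(sub))
        if len(subsets) == n_subsets:
            break
    return subsets

# Synthetic demo: two informative (mutually redundant) features, two noise features.
rng = np.random.default_rng(42)
n = 300
y = rng.integers(0, 2, n)
X = np.column_stack([
    y + rng.normal(0, 0.2, n),   # informative
    y + rng.normal(0, 0.2, n),   # redundant with feature 0
    rng.normal(0, 1, n),         # noise
    rng.normal(0, 1, n),         # noise
])

# Stage 1: keep features whose single-feature accuracy exceeds 0.5.
B = [f for f in range(X.shape[1]) if stump_accuracy(X[:, f], y) > 0.5]
# Stage 2: independent, diverse feature subsets, one base learner per subset.
subsets = build_subsets(X, B)
```

In the full method each selected subset would then train one base learner (e.g. a tree) and the ensemble would aggregate their votes, as in Bagging; that training step is omitted here for brevity.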