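Affiliations: [1] College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu; [2] Nantong Sizhen Electronic Technology Co., Ltd. (南通思振电子科技有限公司), Nantong, Jiangsu
Source: Journal of Sensor Technology and Application (《传感器技术与应用》), 2023, No. 6, pp. 538-549 (12 pages)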
Abstract: Imbalanced data are common in data mining, and such data are often both high-dimensional and class-imbalanced. The skewed distribution of feature attributes in an imbalanced dataset degrades classification performance, while high dimensionality makes learning algorithms very time-consuming. To address this problem, a feature selection method based on combined sampling and ensemble learning is proposed. First, a combined sampling method handles the class imbalance: it concentrates on synthesizing minority-class samples and removes noisy samples while keeping the dataset balanced. Ensemble feature selection is then modeled as a multi-criteria decision-making process, and the VIKOR method is used to obtain a feature importance ranking. Finally, during sequential forward search over the ranked features, the accuracy of the XGBoost algorithm serves as the criterion for evaluating candidate feature subsets, and the optimal feature subset is determined. AUC, G-mean, and F-measure are chosen as evaluation metrics. Experiments on five imbalanced datasets confirm that the proposed algorithm achieves better classification performance and that the resulting model is more robust.
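Keywords: imbalanced data classification; combined sampling; multi-criteria decision-making; VIKOR method; sequential forward selection
Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]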
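The abstract describes ranking features with VIKOR by treating ensemble feature selection as a multi-criteria decision problem. The following is a minimal sketch of the standard VIKOR procedure only, not the authors' implementation. It assumes each row is a feature, each column is an importance score from a different feature-selection criterion (higher is better), the criteria are equally weighted, and the compromise coefficient is v = 0.5. The function name `vikor_rank` and the example scores are hypothetical; the abstract does not state which criteria or weights the paper uses.

```python
def vikor_rank(scores, weights=None, v=0.5):
    """Rank alternatives (here: features) with the standard VIKOR method.

    scores  -- list of rows; scores[i][j] is the score of feature i under
               criterion j, where a higher score means a more important feature
    weights -- criterion weights summing to 1 (assumed equal if omitted)
    v       -- weight of group utility versus individual regret (assumed 0.5)
    Returns (order, Q): feature indices from best to worst, and each Q value.
    """
    n_crit = len(scores[0])
    if weights is None:
        weights = [1.0 / n_crit] * n_crit

    # Best and worst value observed for every criterion.
    best = [max(row[j] for row in scores) for j in range(n_crit)]
    worst = [min(row[j] for row in scores) for j in range(n_crit)]

    def regret(row, j):
        # Weighted, normalized distance of this feature from the best value.
        span = best[j] - worst[j]
        return 0.0 if span == 0 else weights[j] * (best[j] - row[j]) / span

    # S: group utility (sum of regrets); R: individual regret (largest regret).
    S = [sum(regret(row, j) for j in range(n_crit)) for row in scores]
    R = [max(regret(row, j) for j in range(n_crit)) for row in scores]

    def scale(values):
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in values]

    # Q: compromise index; a smaller Q means a better-ranked feature.
    Q = [v * s + (1 - v) * r for s, r in zip(scale(S), scale(R))]
    order = sorted(range(len(scores)), key=lambda i: Q[i])
    return order, Q


# Hypothetical scores for three features under three criteria.
example_scores = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.3],
    [0.5, 0.9, 0.4],
]
order, q_values = vikor_rank(example_scores)
print(order)  # [0, 2, 1]
```

In a pipeline like the one the abstract outlines, the resulting order would feed a sequential forward search that adds features one at a time and keeps the subset with the best classifier accuracy.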