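Authors: Wang Yongxin (王泳欣); Zhang Dabin (张大斌); Che Daqing (车大庆); Lyu Jianqiu (吕建秋) (College of Mathematics and Informatics, South China Agricultural University; Guangdong Academy of Science and Technology Management and Planning, Guangzhou 510642, China)
Affiliations: [1] College of Mathematics and Informatics, South China Agricultural University; [2] Guangdong Academy of Science and Technology Management and Planning, Guangzhou 510642
Source: Statistics & Decision (《统计与决策》), 2022, No. 18, pp. 58-63 (6 pages)
Funding: General Program of the National Natural Science Foundation of China (71971089).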
Abstract: Traditional SMOTE and BSMOTE oversampling methods tend to lower the recognition rate of majority-class samples. To address this problem, this paper proposes an improved BSMOTE algorithm based on local density (LDBSMOTE). First, local density values are computed from the sample distribution and used to screen root samples, so that potentially valuable samples are retained as far as possible. Next, SMOTE is applied to synthesize new samples. Finally, an ensemble learning algorithm is used for classification. To verify the effectiveness of LDBSMOTE, experiments are conducted on 15 public data sets. The results show that, compared with SMOTE and BSMOTE, LDBSMOTE achieves an average improvement of 2.25% in F1, G-mean, and AUC, and obtains the highest average score on each of these metrics. It improves the recognition rate of minority-class samples while preserving the recognition rate of majority-class samples, and so effectively improves classification performance.
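The first two steps described in the abstract (density-based screening of root samples, then SMOTE interpolation) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the density formula or the screening rule, so the inverse mean k-nearest-neighbour distance used in `local_density` and the quantile cut-off in `select_roots` are hypothetical stand-ins, as are all function and parameter names. The final ensemble classification step is omitted.

```python
import numpy as np

def local_density(X_min, X_all, k=5):
    """Assumed density proxy: inverse mean distance to the k nearest samples."""
    dists = np.linalg.norm(X_min[:, None, :] - X_all[None, :, :], axis=2)
    dists.sort(axis=1)
    # Column 0 is the zero distance of each minority point to itself,
    # because X_min is assumed to be a subset of X_all.
    return 1.0 / (dists[:, 1:k + 1].mean(axis=1) + 1e-12)

def select_roots(X_min, X_all, k=5, quantile=0.25):
    """Hypothetical screening rule: drop only the sparsest minority samples."""
    density = local_density(X_min, X_all, k)
    return X_min[density >= np.quantile(density, quantile)]

def smote_from_roots(roots, X_min, n_new, k=5, seed=0):
    """Standard SMOTE interpolation, started from the selected root samples."""
    rng = np.random.default_rng(seed)
    dists = np.linalg.norm(roots[:, None, :] - X_min[None, :, :], axis=2)
    # Skip index 0 of each row: it is the root itself (roots come from X_min).
    neighbours = np.argsort(dists, axis=1)[:, 1:k + 1]
    root_idx = rng.integers(0, len(roots), size=n_new)
    col_idx = rng.integers(0, neighbours.shape[1], size=n_new)
    nb_idx = neighbours[root_idx, col_idx]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return roots[root_idx] + gap * (X_min[nb_idx] - roots[root_idx])

# Toy imbalanced data: 100 majority points, 20 minority points.
rng = np.random.default_rng(42)
X_maj = rng.normal(0.0, 1.0, size=(100, 2))
X_min = rng.normal(2.0, 0.5, size=(20, 2))
X_all = np.vstack([X_maj, X_min])

roots = select_roots(X_min, X_all)
X_new = smote_from_roots(roots, X_min, n_new=80)
```

Because each synthetic point lies on a segment between two existing minority samples, the new samples stay inside the bounding box of the minority class; which samples serve as roots is the only part the density screening changes.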
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]