RE-SMOTE: A Novel Imbalanced Sampling Method Based on SMOTE with Radius Estimation


Authors: Dazhi E, Jiale Liu, Ming Zhang, Huiyuan Jiang, Keming Mao

Affiliations: [1] Shenyang Fire Science and Technology Research Institute, Ministry of Emergency Management of the People’s Republic of China, Shenyang, 110034, China; [2] College of Software, Northeastern University, Shenyang, 110006, China

Source: Computers, Materials & Continua (English-language journal), 2024, Issue 12, pp. 3853-3880 (28 pages)

Funding: Supported by the National Key R&D Program of China, No. 2022YFC3006302.

Abstract: Class imbalance is a distinctive feature of many datasets, and how to balance such datasets has become a hot topic in machine learning. The Synthetic Minority Oversampling Technique (SMOTE) is the classical method for this problem. Although much research has been conducted on SMOTE, the problem of synthetic sample singularity remains. To address class imbalance and the limited diversity of generated samples, this paper proposes RE-SMOTE, a hybrid resampling method for binary imbalanced datasets that builds on improvements to two oversampling methods, parameter-free SMOTE (PF-SMOTE) and SMOTE-Weighted Edited Nearest Neighbor (SMOTE-WENN). Initially, minority class samples are divided into safe and boundary categories. Boundary minority samples are regenerated through linear interpolation with the nearest majority class samples. In contrast, safe minority samples are randomly generated within a circular range centered on the initial safe minority samples, with a radius determined by the distance to the nearest majority class samples. Furthermore, we use Weighted Edited Nearest Neighbor (WENN) and relative density methods to clean the generated samples and remove low-quality ones. Relative density is calculated from the ratio of majority to minority samples among the reverse k-nearest neighbor samples. To verify the effectiveness and robustness of the proposed model, we conducted a comprehensive experimental study on 40 datasets selected from real applications. The experimental results show the superiority of radius estimation-SMOTE (RE-SMOTE) over other state-of-the-art methods. Code is available at: https://github.com/blue9792/RE-SMOTE (accessed on 30 September 2024).
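The two generation rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the repository linked above for that). Several details are assumptions made only for illustration: the rule in `is_boundary` (a minority sample counts as boundary if any of its k nearest neighbors is a majority sample), the interpolation coefficient being drawn from [0, 1), the radius being exactly the distance to the nearest majority sample, and all function names. The WENN and relative-density cleaning steps are not covered.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_majority(x, majority):
    """Return the majority sample closest to x."""
    return min(majority, key=lambda m: dist(x, m))


def is_boundary(x, minority, majority, k=5):
    """ASSUMED rule: x is 'boundary' if any of its k nearest neighbors is a majority sample."""
    neighbors = [(dist(x, p), 0) for p in minority if p is not x]
    neighbors += [(dist(x, p), 1) for p in majority]
    neighbors.sort(key=lambda t: t[0])
    return any(label == 1 for _, label in neighbors[:k])


def interpolate_toward(x, m, rng):
    """Boundary case: a point on the segment from x toward its nearest majority sample m."""
    lam = rng.random()  # ASSUMED range [0, 1); the paper may restrict it further
    return [xi + lam * (mi - xi) for xi, mi in zip(x, m)]


def sample_in_ball(center, radius, rng):
    """Safe case: a point drawn uniformly from the ball of the given radius around center."""
    d = len(center)
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in direction)) or 1.0
    r = radius * rng.random() ** (1.0 / d)  # uniform in volume, not in radius
    return [c + r * v / norm for c, v in zip(center, direction)]


def generate(minority, majority, n_new, k=5, seed=0):
    """Draw n_new synthetic minority samples using the two rules above."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        m = nearest_majority(x, majority)
        if is_boundary(x, minority, majority, k):
            synthetic.append(interpolate_toward(x, m, rng))
        else:
            # ASSUMPTION: radius equals the distance to the nearest majority sample
            synthetic.append(sample_in_ball(x, dist(x, m), rng))
    return synthetic
```

Calling `generate(minority, majority, n_new=10)` with lists of coordinate lists returns ten synthetic points. In this sketch, safe samples spread around their seed point rather than along a line between two minority samples, which is one way to read the abstract's goal of increasing sample diversity.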

Keywords: Imbalanced data sampling, SMOTE, radius estimation

Classification code: TP182.2 [Automation and Computer Technology: Control Theory and Control Engineering]

 
