检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:Dazhi E Jiale Liu Ming Zhang Huiyuan Jiang Keming Mao
机构地区:[1]Shenyang Fire Science and Technology Research Institute,Ministry of Emergency Management of the People’s Republic of China,Shenyang,110034,China [2]College of Software,Northeastern University,Shenyang,110006,China
出 处:《Computers, Materials & Continua》2024年第12期3853-3880,共28页计算机、材料和连续体(英文)
基 金:supported by the National Key R&D Program of China,No.2022YFC3006302.
摘 要:Imbalance is a distinctive feature of many datasets,and how to make the dataset balanced become a hot topic in the machine learning field.The Synthetic Minority Oversampling Technique(SMOTE)is the classical method to solve this problem.Although much research has been conducted on SMOTE,there is still the problem of synthetic sample singularity.To solve the issues of class imbalance and diversity of generated samples,this paper proposes a hybrid resampling method for binary imbalanced data sets,RE-SMOTE,which is designed based on the improvements of two oversampling methods parameter-free SMOTE(PF-SMOTE)and SMOTE-Weighted Ensemble Nearest Neighbor(SMOTE-WENN).Initially,minority class samples are divided into safe and boundary minority categories.Boundary minority samples are regenerated through linear interpolation with the nearest majority class samples.In contrast,safe minority samples are randomly generated within a circular range centered on the initial safe minority samples with a radius determined by the distance to the nearest majority class samples.Furthermore,we use Weighted Edited Nearest Neighbor(WENN)and relative density methods to clean the generated samples and remove the low-quality samples.Relative density is calculated based on the ratio of majority to minority samples among the reverse k-nearest neighbor samples.To verify the effectiveness and robustness of the proposed model,we conducted a comprehensive experimental study on 40 datasets selected from real applications.The experimental results show the superiority of radius estimation-SMOTE(RE-SMOTE)over other state-of-the-art methods.Code is available at:https://github.com/blue9792/RE-SMOTE(accessed on 30 September 2024).
关 键 词:Imbalanced data sampling SMOTE radius estimation
分 类 号:TP182.2[自动化与计算机技术—控制理论与控制工程]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.222