Authors: Jyoti Arora, Meena Tushir, Keshav Sharma, Lalit Mohan, Aman Singh, Abdullah Alharbi, Wael Alosaimi
Affiliations: [1] Department of Information Technology, MSIT, GGSIPU, New Delhi, 110058, India; [2] Department of Electrical and Electronic Engineering, MSIT, GGSIPU, New Delhi, 110058, India; [3] School of Computer Science and Engineering, Lovely Professional University, 144411, Punjab, India; [4] Department of Information Technology, College of Computers and Information Technology, Taif University, 11099, Taif 21944, Saudi Arabia
Source: Computers, Materials & Continua, 2022, Issue 12, pp. 4801-4817 (17 pages; English-language journal)
Funding: This research was supported by Taif University Researchers Supporting Project number (TURSP-2020/254), Taif University, Taif, Saudi Arabia.
Abstract: Datasets with an imbalanced class distribution are difficult to handle with standard classification algorithms, and class imbalance is still considered a challenging research problem in supervised learning. Various machine learning techniques are designed to operate on balanced datasets; therefore, different undersampling, oversampling, and hybrid strategies have been proposed to deal with imbalanced datasets. Highly skewed datasets, however, still pose problems of generalization and noise generation during resampling. To overcome these problems, this paper proposes a majority clustering model for the classification of imbalanced datasets, known as MCBC-SMOTE (Majority Clustering for Balanced Classification-SMOTE). The model provides a method to convert a binary classification problem into a multi-class problem. In the proposed algorithm, the number of clusters for the majority class is calculated using the elbow method, and the minority class is oversampled to the average size of the majority-class clusters to generate a symmetrical class distribution. The proposed technique is cost-effective, reduces noise generation, and removes the imbalances present both between and within classes. In evaluations on diverse real datasets, it gave better classification results than existing state-of-the-art methods on several performance metrics.
Keywords: Imbalance class problem; classification; SMOTE; k-means; clustering; sampling
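
The pipeline outlined in the abstract (cluster the majority class, relabel each cluster as its own class, then oversample the minority class up to the average cluster size) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `kmeans`, `interpolate_minority`, and `rebalance` are hypothetical, the number of clusters `k` is passed in by hand rather than chosen by the elbow method, "average of clustered majority classes" is read as the mean cluster size, and the oversampling step interpolates between random minority pairs instead of using SMOTE's k-nearest-neighbor selection.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Tiny Lloyd's k-means on tuples; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def interpolate_minority(minority, target, seed=0):
    """Simplified SMOTE-style step: add points on segments between random minority pairs.

    Requires at least two minority samples. Real SMOTE picks partners among
    the k nearest minority neighbors; random pairing is a simplification.
    """
    rng = random.Random(seed)
    out = list(minority)
    while len(out) < target:
        a, b = rng.sample(minority, 2)
        t = rng.random()
        out.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return out


def rebalance(majority, minority, k, seed=0):
    """Return (X, y): minority is class 0, majority clusters are classes 1..k."""
    labels = kmeans(majority, k, seed=seed)
    sizes = [labels.count(c) for c in range(k)]
    # Assumption: the minority target is the mean majority-cluster size.
    target = max(round(sum(sizes) / k), len(minority))
    new_minority = interpolate_minority(minority, target, seed=seed)
    X = list(majority) + new_minority
    y = [lab + 1 for lab in labels] + [0] * len(new_minority)
    return X, y
```

Calling `rebalance(majority, minority, k)` with lists of equal-length numeric tuples yields a multi-class dataset in which the minority class is comparable in size to the average majority cluster; any standard multi-class classifier could then be trained on `(X, y)`, with predictions of classes 1..k mapped back to "majority".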