Improved SMOTE Algorithm Based on DBSCAN


Authors: QIU Canhua[1], WU Jie[1]

Affiliation: [1] Tongji University, Shanghai 200092, China

Source: Computer & Network (《计算机与网络》), 2022, No. 7, pp. 62-66 (5 pages)

Abstract: The traditional Synthetic Minority Oversampling Technique (SMOTE) ignores both between-class and within-class imbalance and cannot control the noise introduced by synthesized samples. To address these problems, an improved SMOTE algorithm based on the DBSCAN clustering algorithm is proposed. DBSCAN is first used to cluster the minority-class samples. A minority-class density coefficient and a sampling weight are then computed, and the number of samples to generate is assigned to each cluster accordingly. The sample points in each cluster are divided into two groups according to their distance from the cluster centroid, and a different random coefficient is assigned to the points in each group for oversampling, which yields a new, more balanced data set. Experiments on the collected data sets show that the improved algorithm substantially improves the classification performance of the classifier.
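The abstract names the steps of the method but not its formulas, so the following Python sketch only illustrates the overall pipeline and is not the authors' implementation. Several details are assumptions made here for illustration: density is taken as cluster size divided by mean distance to the centroid, sampling weights are inversely proportional to that density, the two groups are split at the mean centroid distance, and the interpolation coefficient is drawn from [0, 1] for inner points and [0, 0.5] for outer points. The function names `dbscan` and `cluster_oversample` are hypothetical, and the small DBSCAN routine is a stand-in for a library implementation.

```python
import math
import random


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reach)
    return labels


def cluster_oversample(minority, n_new, eps, min_pts, seed=0):
    """Generate n_new synthetic minority points, cluster by cluster."""
    rng = random.Random(seed)
    clusters = {}
    for p, lab in zip(minority, dbscan(minority, eps, min_pts)):
        if lab != -1:  # noise points are never used as seeds
            clusters.setdefault(lab, []).append(p)
    if not clusters:
        return []

    info, sparsity = {}, {}
    for lab, pts in clusters.items():
        centroid = [sum(c) / len(pts) for c in zip(*pts)]
        dists = [math.dist(p, centroid) for p in pts]
        mean_d = sum(dists) / len(dists)
        info[lab] = (dists, mean_d)
        # ASSUMPTION: density = size / mean centroid distance; weight ~ 1/density.
        sparsity[lab] = (mean_d + 1e-12) / len(pts)

    total = sum(sparsity.values())
    quota = {lab: int(n_new * s / total) for lab, s in sparsity.items()}
    # Hand the rounding remainder to the sparsest cluster.
    quota[max(sparsity, key=sparsity.get)] += n_new - sum(quota.values())

    synthetic = []
    for lab, pts in clusters.items():
        dists, mean_d = info[lab]
        for _ in range(quota[lab]):
            i = rng.randrange(len(pts))
            mate = rng.choice(pts)  # interpolation partner from the same cluster
            # ASSUMPTION: inner points interpolate over [0, 1], outer over [0, 0.5].
            upper = 1.0 if dists[i] <= mean_d else 0.5
            gap = rng.uniform(0.0, upper)
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(pts[i], mate)))
    return synthetic


# Two minority clusters plus one isolated point that DBSCAN labels as noise.
minority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
            (5.0, 5.0), (5.5, 5.2), (5.2, 5.6), (5.6, 5.5), (5.9, 5.1),
            (20.0, 20.0)]
new_points = cluster_oversample(minority, n_new=12, eps=1.0, min_pts=3)
print(len(new_points))  # 12
```

Because every synthetic point is interpolated between two members of the same cluster, it stays inside that cluster's extent, and the isolated point at (20, 20) never contributes. This is one plausible reading of how clustering could limit noisy synthetic samples. The weighting and coefficient ranges should be replaced with the paper's own definitions when reproducing its results.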

Keywords: SMOTE algorithm; DBSCAN algorithm; imbalanced data set; oversampling

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
