Congestion identification method based on cross-combination resampling (基于交叉组合重采样的拥挤识别方法)

Identification method of the congestion based on cross-portfolio resamples


Authors: Zheng Changjiang (郑长江)[1], Wang Chen (王晨)[1]

Affiliation: [1] College of Civil and Transportation Engineering, Hohai University, Nanjing, Jiangsu 210098, China

Source: Journal of Transport Science and Engineering (《交通科学与工程》), 2014, No. 4, pp. 77-82 (6 pages)

Funding: Natural Science Foundation of Jiangsu Province (BK2011745)

Abstract: To address the imbalanced class distribution of congestion data, a new resampling method, cross-combination resampling, is proposed. The method combines random undersampling with SMOTE and applies the two in a crossed fashion to the original data, so as to reduce the damage that resampling does to the non-uniformity of the original data. Simulation yields an original sample of non-congested and congested data at a ratio of 1:10.1. Based on practical conditions, cross sampling is then used to produce data sets with class ratios of 1:5, 1:3, and 1:1, and the classification results for the three cases are compared and analyzed. Using a naive Bayes classifier, a Bayesian network classifier, and a neural network classifier, cross-combination resampling is compared experimentally with ordinary combination resampling on the data sets of each ratio. The results show that cross-combination resampling better resolves the problems that imbalanced congestion data cause for classifiers.
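
The abstract names the two building blocks (random undersampling and SMOTE) but does not spell out how they are crossed. The sketch below is only one possible reading, not the authors' procedure: it alternates small undersampling steps on the larger class with small SMOTE steps on the smaller class until a target class ratio is reached. The function names, the `step`, `k`, and `seed` parameters, and the randomly generated demo points are all assumptions made for illustration. The classes are labelled generically as majority and minority, and at least two minority samples are assumed.

```python
import random
from math import dist


def random_undersample(samples, n_keep, rng):
    """Keep n_keep samples drawn uniformly without replacement."""
    return rng.sample(samples, n_keep)


def smote(samples, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a random
    sample and one of its k nearest neighbours within the same class."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def alternating_resample(majority, minority, target_ratio, step=0.1, k=5, seed=0):
    """Hypothetical 'crossed' scheme: alternate undersampling and SMOTE in
    small steps until len(majority) / len(minority) <= target_ratio."""
    rng = random.Random(seed)
    majority, minority = list(majority), list(minority)
    while len(majority) / len(minority) > target_ratio:
        # Undersampling step: drop a fraction `step` of the majority class.
        n_keep = max(int(len(majority) * (1 - step)), 1)
        majority = random_undersample(majority, n_keep, rng)
        if len(majority) / len(minority) <= target_ratio:
            break
        # SMOTE step: grow the minority class by a fraction `step`.
        n_new = max(int(len(minority) * step), 1)
        minority += smote(minority, n_new, k, rng)
    return majority, minority


# Demo data (random points, not the paper's simulation): ratio 10.1 to 1.
demo_rng = random.Random(42)
majority = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(1010)]
minority = [(demo_rng.gauss(3, 1), demo_rng.gauss(3, 1)) for _ in range(100)]

for target in (5, 3, 1):
    maj, mino = alternating_resample(majority, minority, target_ratio=target)
    print(target, len(maj), len(mino))
```

Because each pass removes only part of the larger class and synthesizes only part of the smaller one, neither class is reshaped in a single large step. Whether this matches the crossing described in the paper cannot be confirmed from the abstract alone.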

Keywords: congestion identification; imbalanced classification; resampling method; cross combination; classifier

Classification number: U491.265 [Transportation Engineering: Transportation Planning and Management]

 
