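Authors: JIANG Jie (江颉)[1], WANG Zhuofang (王卓芳)[1], GONG Rong-sheng, CHEN Tieming (陈铁明)[1]
Affiliations: [1] College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023; [2] Intelligent Systems Laboratory, University of Cincinnati, Cincinnati 45221, USA
Source: Computer Science (《计算机科学》), 2013, No. 4, pp. 131-135 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (61103044), the Zhejiang Provincial Natural Science Foundation (Y1110567), and Zhejiang Provincial Department of Science and Technology program projects (2010C31126; 2011C21046).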
Abstract: Traditional classification algorithms, when applied directly to imbalanced datasets, often yield low classification accuracy, especially for the minority class. A new classification method for imbalanced data based on the K-S statistic is proposed to improve minority-class recognition. First, the K-S statistic is used to measure the relationship between the class and each feature so that redundant features can be removed. Then a K-S based decision tree is built to segment the training data into several subsets, which adjusts the degree of imbalance in the data. Finally, two-way resampling (forward and backward) is applied to the segmented subsets before classification learning. The assumptions required by the K-S statistic are easily satisfied, so the method is efficient and widely applicable. Comparative experiments on the KDD99 intrusion detection data show that, for imbalanced datasets, the method achieves high classification accuracy for both the majority and the minority class.
Classification No.: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
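
The abstract describes using the K-S statistic to relate each feature to the class label. As a rough illustration only, the sketch below computes the standard two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical CDFs) and uses it to rank feature columns by how well they separate a minority class from the rest. The function names `ks_statistic` and `rank_features`, the toy data, and the ranking rule are assumptions made for this example; the paper's actual feature-removal criterion, decision tree, and resampling steps are not given in this record and are not reproduced here.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max absolute gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    best = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        best = max(best, abs(cdf_a - cdf_b))
    return best


def rank_features(rows, labels, minority_label):
    """Hypothetical helper: score each feature column by the K-S statistic
    between minority-class rows and all other rows, highest score first."""
    scores = []
    for j in range(len(rows[0])):
        minority = [r[j] for r, y in zip(rows, labels) if y == minority_label]
        others = [r[j] for r, y in zip(rows, labels) if y != minority_label]
        scores.append((ks_statistic(minority, others), j))
    return sorted(scores, reverse=True)


# Toy data (made up): feature 0 separates the classes, feature 1 barely does.
rows = [(0.0, 1.0), (0.1, 2.0), (0.2, 1.0), (0.9, 2.0), (1.0, 1.0), (0.8, 2.0)]
labels = [0, 0, 0, 1, 1, 1]
ranking = rank_features(rows, labels, minority_label=1)
```

In this toy setup a feature with a K-S score near 0 has almost the same distribution in both groups and would be a candidate for removal, while a score near 1 indicates strong separation. Any cutoff for dropping features would be a tuning choice and is not specified by the abstract.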