Authors: Tao Xinmin [1], Hao Siyuan [1], Zhang Dongxue [1], Xu Peng [1]
Affiliation: [1] College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, China
Source: Journal of Harbin Engineering University (《哈尔滨工程大学学报》), 2013, No. 3, pp. 381-388 (8 pages)
Funding: National Natural Science Foundation of China (61074076); China Postdoctoral Science Foundation (20090450119); New Teacher Fund for Doctoral Programs of China (20092304120017)
Abstract: The traditional SVM algorithm classifies poorly on imbalanced data sets. To address this, an ensemble SVM algorithm based on kernel clustering is proposed. The algorithm first clusters the majority-class samples in kernel space using the kernel fuzzy C-means clustering algorithm (KFCM), then randomly selects representative samples that carry the cluster information. This reduces the number of majority-class samples and, at the same time, shifts the SVM classification boundary toward the majority class. AdaBoost is then used to combine the kernel-clustering-based under-sampling SVM classifiers into an ensemble, which improves the generalization performance of SVM on imbalanced data. The proposed algorithm is compared with other ensemble methods that preprocess imbalanced data. The experimental results show that it improves SVM classification performance on the minority class, and that overall classification performance and running efficiency also improve markedly.
Keywords: imbalanced data; SVM algorithm; AdaBoost; kernel clustering; under-sampling
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
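The under-sampling step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions not taken from the paper: it uses hard kernel k-means with an RBF kernel in place of KFCM, it draws samples from each cluster in proportion to cluster size, and all function names and parameters (`rbf`, `kernel_kmeans`, `undersample_majority`, `gamma`, `n_keep`) are hypothetical. The AdaBoost ensemble stage is not shown.

```python
import math
import random


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kernel_kmeans(points, k, gamma=1.0, iters=20, seed=0):
    """Hard kernel k-means (a stand-in for KFCM); returns one label per point.

    Squared feature-space distance to the mean of cluster C, dropping the
    K(x, x) term that is the same for every cluster:
        sum_{j,l in C} K(x_j, x_l) / |C|^2  -  2 * sum_{j in C} K(x, x_j) / |C|
    """
    rng = random.Random(seed)
    n = len(points)
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    labels = [rng.randrange(k) for _ in range(n)]  # random initial assignment
    for _ in range(iters):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        within = [
            sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:  # skip clusters that are currently empty
                    continue
                d = within[c] - 2.0 * sum(K[i][j] for j in m) / len(m)
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels


def undersample_majority(majority, n_keep, k=3, gamma=1.0, seed=0):
    """Randomly keep roughly n_keep majority samples, spread over the clusters.

    Assumption: each non-empty cluster contributes samples in proportion to
    its size (at least one), so every cluster stays represented.
    """
    rng = random.Random(seed)
    labels = kernel_kmeans(majority, k, gamma, seed=seed)
    kept = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        quota = max(1, round(n_keep * len(idx) / len(majority)))
        kept.extend(rng.sample(idx, min(quota, len(idx))))
    return [majority[i] for i in sorted(kept)]


# Toy majority class: three Gaussian blobs of 20 points each.
data_rng = random.Random(1)
majority = [
    (data_rng.gauss(cx, 0.3), data_rng.gauss(cy, 0.3))
    for cx, cy in [(0, 0), (3, 3), (0, 3)]
    for _ in range(20)
]
reduced = undersample_majority(majority, n_keep=12)
print(len(majority), "->", len(reduced))
```

In a full pipeline, the reduced majority set would be combined with the minority samples to train one SVM per boosting round. Because of rounding and the one-sample minimum per cluster, the number of retained samples can differ slightly from `n_keep`.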