Modified proximal support vector machine algorithm for dealing with unbalanced samples  (Cited by: 6)


Authors: Liu Yan (刘艳) [1,2], Zhong Ping (钟萍) [1], Chen Jing (陈静) [1], Song Xiaohua (宋晓华) [2], He Yun (何云) [2]

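Affiliations: [1] College of Science, China Agricultural University, Beijing 100083; [2] School of Mechanical and Electrical Engineering, Yanching Institute of Technology, Langfang, Hebei 065201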

Source: Journal of Computer Applications (《计算机应用》), 2014, No. 6, pp. 1618-1621 (4 pages)

Funding: National Natural Science Foundation of China (11271367, 11171346)

Abstract: When a Proximal Support Vector Machine (PSVM) is trained on unbalanced samples, it overfits the class with more samples and underestimates the misclassification error of the class with fewer samples, which lowers the classification accuracy over the whole sample set. To address this problem, a modified PSVM algorithm for unbalanced samples is proposed. The new algorithm not only assigns different penalty factors to the positive and negative classes, but also adds a new parameter to the constraint, making the classification hyperplane more flexible. The algorithm first trains on the training set to obtain the optimal parameters, then trains on the test set to obtain the classification hyperplane, and finally outputs the classification results. Experiments on 9 datasets from the UCI database show that the new algorithm improves classification accuracy by an average of 2.19 percentage points in the linear case and 3.14 percentage points in the nonlinear case, effectively strengthening the generalization ability of the model.
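The abstract does not give the modified optimization problem, so the following is only a minimal sketch of the first idea it names: separate penalty factors for the positive and negative classes. It assumes the standard linear PSVM formulation of Fung and Mangasarian (minimize the squared slack plus w'w + gamma^2 subject to D(Aw - e*gamma) + y = e), with the single penalty replaced by a per-class penalty. Under that assumption the solution is the linear system (I + E'CE) z = E'C d, where E = [A, -e], C is the diagonal matrix of per-sample penalties, d is the label vector, and z = [w; gamma]. The additional constraint parameter proposed in the paper is not defined in the abstract and is deliberately left out. The names `fit_weighted_psvm`, `predict`, `c_pos`, and `c_neg` are hypothetical, and the code uses only the Python standard library.

```python
def solve_linear_system(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Move the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_weighted_psvm(samples, labels, c_pos=1.0, c_neg=1.0):
    """Fit a linear PSVM with class-specific penalties (assumed formulation).

    samples: list of feature lists; labels: list of +1 / -1.
    Returns (w, gamma) for the decision function sign(w.x - gamma).
    """
    dim = len(samples[0]) + 1
    # Start from the identity matrix, which regularizes both w and gamma.
    lhs = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    rhs = [0.0] * dim
    for x, d in zip(samples, labels):
        row = list(x) + [-1.0]            # one row of E = [A, -e]
        c = c_pos if d > 0 else c_neg     # per-class penalty factor
        for i in range(dim):
            rhs[i] += c * d * row[i]      # accumulates E' C d
            for j in range(dim):
                lhs[i][j] += c * row[i] * row[j]  # accumulates E' C E
    z = solve_linear_system(lhs, rhs)
    return z[:-1], z[-1]


def predict(x, w, gamma):
    """Classify one sample with the fitted hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - gamma
    return 1 if score >= 0 else -1
```

In this sketch, giving the minority class the larger penalty (for example `c_pos` greater than `c_neg` when positives are rare) makes its fitting errors cost more, which is the usual motivation for class-specific penalties. How the paper actually chooses the penalty values is not stated in the abstract.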

Keywords: proximal support vector machine; unbalanced samples; parameter; penalty factor; model improvement

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
