A Novel Parameters Optimization of SVM for Large Data Sets  Cited by: 12

Authors: Gong Yonggang (龚永罡) [1], Tang Shiping (汤世平) [2]

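Affiliations: [1] School of Information Engineering, Beijing Technology and Business University, Beijing 100037; [2] School of Computer Science, Beijing Institute of Technology, Beijing 100081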

Source: Computer Simulation (《计算机仿真》), 2010, No. 9, pp. 204-207 (4 pages)

Abstract: This paper addresses fast parameter optimization for data regression. Conventional SVM parameter optimization relies on an exhaustive search over a wide parameter range, which is too time-consuming for training on large data sets. To shorten the search, a strategy based on uniform design (UD) and self-calling support vector regression (SVR) is proposed. First, a mixed uniform design table with 3 factors and 9 levels selects 27 representative parameter combinations, and each combination is cross-validated on the training set to obtain its mean squared error (MSE). Second, taking MSE as the objective function, a self-calling SVR is trained on the 27 parameter combinations and their MSEs, using the leave-one-out method, to model the relationship between them. This model then predicts the MSE of all 729 parameter combinations in the search domain, and the combination with the smallest predicted MSE is taken as optimal. Finally, the large data set is trained and predicted with that combination. Simulations on 3 benchmark data sets show that the method preserves prediction accuracy while substantially reducing training and modeling time, offering an effective approach to SVM parameter selection for large data sets.

Keywords: uniform design; support vector regression; large data; parameters

Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
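The search strategy summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions the abstract does not confirm: it uses scikit-learn; it takes the three factors to be `C`, `gamma`, and `epsilon` of an RBF-kernel SVR; the 9-level ranges are hypothetical; the 27 runs come from a simple lattice construction (`lattice_design`, a stand-in for the paper's uniform design table); the fold count is set to 5; and leave-one-out is read as the way the surrogate SVR is tuned. The data are synthetic.

```python
import itertools

import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut, cross_val_score
from sklearn.svm import SVR

# Hypothetical search ranges: 9 levels per factor, stored as exponents of 2.
LOG2_C = np.linspace(-2, 10, 9)
LOG2_GAMMA = np.linspace(-10, 2, 9)
LOG2_EPS = np.linspace(-10, -2, 9)


def lattice_design(n_runs=27, n_levels=9, generators=(1, 4, 10)):
    """Return n_runs rows of level indices; every level occurs equally often per column.

    Stand-in for the paper's uniform design table (assumption).
    """
    runs = np.arange(n_runs)[:, None]
    return (runs * np.array(generators) % n_runs) // (n_runs // n_levels)


def to_params(levels):
    """Map one row of level indices to SVR keyword arguments."""
    i, j, k = (int(v) for v in levels)
    return {
        "C": float(2.0 ** LOG2_C[i]),
        "gamma": float(2.0 ** LOG2_GAMMA[j]),
        "epsilon": float(2.0 ** LOG2_EPS[k]),
    }


def cv_mse(levels, X, y, folds=5):
    """Cross-validated MSE of an RBF SVR at the given levels (the costly step)."""
    model = SVR(kernel="rbf", **to_params(levels))
    scores = cross_val_score(model, X, y, cv=folds, scoring="neg_mean_squared_error")
    return -scores.mean()


def surrogate_search(X, y):
    """Evaluate 27 design points, fit a surrogate SVR, and rank all 729 combinations."""
    design = lattice_design()                                # 27 x 3 level indices
    mse = np.array([cv_mse(row, X, y) for row in design])    # 27 costly evaluations
    # Standardize the target so the surrogate's default epsilon is on a sensible scale.
    target = (mse - mse.mean()) / (mse.std() + 1e-12)
    surrogate = GridSearchCV(
        SVR(kernel="rbf"),
        {"C": [1, 10, 100], "gamma": [0.1, 1, 10]},
        cv=LeaveOneOut(),
        scoring="neg_mean_squared_error",
    ).fit(design / 8.0, target)
    grid = np.array(list(itertools.product(range(9), repeat=3)))  # 729 x 3
    best = grid[np.argmin(surrogate.predict(grid / 8.0))]
    return design, grid, to_params(best)


# Synthetic regression data, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

design, grid, best_params = surrogate_search(X, y)
final_model = SVR(kernel="rbf", **best_params).fit(X, y)
print(best_params)
```

The point of the design is the cost ratio: only 27 parameter combinations require cross-validated training on the full data, while the remaining combinations are scored by a surrogate trained on 27 rows, which is cheap regardless of how large the original data set is.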

 
