Optimal Parameters Selection of Support Vector Machine Based on MapReduce Framework (基于MapReduce的支持向量机参数选择研究)


Authors: LIU Lizhi [1], YANG Min (Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology), Wuhan 430205, China)

Affiliation: [1] Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology), Wuhan 430205, Hubei, China

Source: Journal of Wuhan Institute of Technology (《武汉工程大学学报》), 2022, No. 1, pp. 85-91 (7 pages)

Funding: Guiding Project of the 2017 Scientific Research Program of the Hubei Provincial Department of Education (B2017051).

Abstract: To address the problem of selecting optimal classification model parameters for a support vector machine (SVM) on a distributed Hadoop cluster, this paper proposes a parameter selection algorithm based on the MapReduce framework. The algorithm can select the optimal model parameters in two modes: as a series of MapReduce jobs run serially, or as a single MapReduce job. In the Map stage, it reads the parameter file stored in the Hadoop Distributed File System (HDFS) and generates an intermediate result with a distinct key for each parameter set, which ensures that in the Reduce stage each parallel task performs cross-validation on exactly one parameter set. Experimental results show that, provided cluster memory consumption stays reasonable and an appropriate number of Reduce tasks is set for the coarse-grained optimal parameter search, the single-job mode improves running efficiency by at least 1.7 times over the serial-job mode, significantly reducing the time needed to obtain the optimal model parameters.
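The key-per-parameter-set idea described in the abstract can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions, not the authors' Hadoop implementation: the parameter file format (`C,gamma` per line), the function names `map_phase`, `shuffle`, `reduce_phase` and `select_best`, and the `toy_score` stand-in for k-fold cross-validation are all hypothetical, and the reduce calls run sequentially here rather than as parallel Reduce tasks.

```python
from collections import defaultdict
from itertools import product
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Assumed parameter pair (C, gamma); the abstract does not name the parameters.
Params = Tuple[float, float]


def map_phase(param_lines: Iterable[str]) -> Iterator[Tuple[int, Params]]:
    """Emit one (key, params) pair per parameter line, each with a distinct key."""
    key = 0
    for line in param_lines:
        line = line.strip()
        if not line:
            continue
        c, gamma = (float(v) for v in line.split(","))
        yield key, (c, gamma)
        key += 1


def shuffle(pairs: Iterable[Tuple[int, Params]]) -> Dict[int, List[Params]]:
    """Group intermediate values by key, as the framework does between Map and Reduce."""
    groups: Dict[int, List[Params]] = defaultdict(list)
    for key, params in pairs:
        groups[key].append(params)
    return dict(groups)


def reduce_phase(
    key: int, values: List[Params], cross_validate: Callable[[Params], float]
) -> Tuple[Params, float]:
    """Score the single parameter set that arrives under this key."""
    (params,) = values  # distinct keys mean exactly one parameter set per reduce call
    return params, cross_validate(params)


def select_best(
    param_lines: Iterable[str], cross_validate: Callable[[Params], float]
) -> Tuple[Params, float]:
    """Run map, shuffle and reduce, then keep the highest-scoring parameter set."""
    groups = shuffle(map_phase(param_lines))
    results = [reduce_phase(k, v, cross_validate) for k, v in sorted(groups.items())]
    return max(results, key=lambda r: r[1])


def toy_score(params: Params) -> float:
    """Placeholder for cross-validation accuracy; by construction peaks at C=10, gamma=0.1."""
    c, gamma = params
    return 1.0 / (1.0 + abs(c - 10.0) + abs(gamma - 0.1))


# A coarse 3 x 3 grid, written as the lines of a hypothetical parameter file.
grid = [f"{c},{g}" for c, g in product([1, 10, 100], [0.01, 0.1, 1])]
best_params, best_score = select_best(grid, toy_score)
```

In a real deployment `toy_score` would be replaced by an actual SVM cross-validation routine, and the number of Reduce tasks would bound how many parameter sets are evaluated concurrently.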

Keywords: MapReduce; support vector machine classification; cross-validation; parameter selection

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
