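Affiliations: [1] School of Education Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003; [2] School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003
Source: 《计算机技术与发展》 (Computer Technology and Development), 2015, No. 6, pp. 87-91 (5 pages)
Funding: Natural Science Foundation of Jiangsu Province (BK20130882)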
Abstract: In cloud computing environments, traditional MapReduce-based SVM classification algorithms train on a data set by simply merging the support vectors obtained on each child node, and the resulting classifier falls short in both classification efficiency and accuracy. To address this, the paper proposes an improved training algorithm. First, a genetic algorithm is run on each node to find the optimal kernel function and parameters for that node's data subset. Second, each subset is trained with the resulting parameter combination to obtain its support vectors. Third, the support vectors from all nodes are merged into a global set of support vectors. Fourth, each node merges its subset with the global support vectors to form a new training data set. These four steps are repeated until the global support vectors no longer change, at which point the procedure has converged to the optimal classification model. Experiments on the open-source cloud computing platform Hadoop show that the algorithm's classification accuracy is clearly higher than that of the traditional classification algorithm.
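Keywords: MapReduce; SVM classification algorithm; genetic algorithm; cloud computing
Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]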
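
The four-step loop in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: there is no Hadoop or MapReduce here, and the genetic-algorithm search for kernel and parameters is hidden behind a hypothetical `train_node` callable. The names `iterative_sv_merge`, `toy_trainer`, and `max_rounds` are assumptions made for illustration. The toy trainer handles only separable one-dimensional data, where the hard-margin support vectors are simply the two closest points of opposite classes.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature, label), label is -1 or +1
Trainer = Callable[[Sequence[Sample]], FrozenSet[Sample]]


def toy_trainer(samples: Sequence[Sample]) -> FrozenSet[Sample]:
    """Stand-in for steps 1-2 (GA parameter search + SVM training).

    Assumes 1-D data in which every negative point lies below every
    positive point, so the support vectors are the largest negative
    point and the smallest positive point.
    """
    neg = [s for s in samples if s[1] == -1]
    pos = [s for s in samples if s[1] == +1]
    svs = set()
    if neg:
        svs.add(max(neg))
    if pos:
        svs.add(min(pos))
    return frozenset(svs)


def iterative_sv_merge(
    subsets: Sequence[Sequence[Sample]],
    train_node: Trainer,
    max_rounds: int = 20,  # hypothetical safety cap, not from the paper
) -> FrozenSet[Sample]:
    """Repeat train / merge / redistribute until the global SVs stop changing."""
    global_svs: FrozenSet[Sample] = frozenset()
    working: List[List[Sample]] = [list(s) for s in subsets]
    for _ in range(max_rounds):
        # Steps 1-2 ("map"): each node trains on its current working set.
        local_svs = [train_node(w) for w in working]
        # Step 3 ("reduce"): merge per-node support vectors into a global set.
        new_global = frozenset().union(*local_svs)
        if new_global == global_svs:
            break  # global support vectors unchanged: treat as converged
        global_svs = new_global
        # Step 4: each node's new training set is its subset plus the global SVs.
        working = [list(set(s) | global_svs) for s in subsets]
    return global_svs


if __name__ == "__main__":
    nodes = [
        [(1.0, -1), (5.0, +1)],
        [(2.0, -1), (9.0, +1)],
        [(0.0, -1), (4.0, +1)],
    ]
    print(sorted(iterative_sv_merge(nodes, toy_trainer)))
```

With real data, `train_node` would wrap whatever SVM trainer and parameter search each node runs; the loop itself only needs the returned support vectors to be hashable so that consecutive global sets can be compared.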