Authors: Hu Jian [1,2]; Wang Xiangtai; Mao Yimin; Liu Wei [2]
Affiliations: [1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China; [2] School of Electronic Information Engineering, Gannan University of Science and Technology, Ganzhou, Jiangxi 341000, China
Source: Application Research of Computers (《计算机应用研究》), 2022, No. 2, pp. 447-455 (9 pages)
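Funding: National Natural Science Foundation of China (41562019); National Key Research and Development Program of China (2018YFC1504705); Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ209407, GJJ209405).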
Abstract: Parallel support vector machine (SVM) algorithms in big data environments are sensitive to redundant data, have parameters that are difficult to select, and parallelize inefficiently. To address these problems, this paper proposes RBFO-PSVM, a parallel SVM algorithm based on Relief and bacterial foraging optimization (BFO) and built on MapReduce. First, it designs a feature weight calculation strategy, MI-Relief, which uses mutual information to improve the weight calculation function of the Relief algorithm; this removes redundant features from the data set and reduces the interference of redundant data with parallel SVM classification. Second, it proposes a MapReduce-based hybrid BFO algorithm, MR-HBFO, which selects the optimal SVM parameters in parallel and improves the parameter search capability of the SVM. Finally, it proposes a kernel clustering strategy (KCS) to reduce the size of the data set involved in parallel training, together with a cross-fusion cascaded parallel SVM (CFCPSVM) that improves the feedback mechanism of the cascade SVM (CSVM). CFCPSVM trains the SVM in parallel within the MapReduce programming framework, which improves the parallelization efficiency of the parallel SVM. Experiments show that RBFO-PSVM classifies large data sets better and is more suitable for big data environments.
Keywords: SVM algorithm; MapReduce; CFCPSVM model; MI-Relief strategy; MR-HBFO algorithm
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
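Note: The abstract builds its feature selection step on the Relief algorithm. As background only, the following is a minimal sketch of the classic Relief weight update for a two-class problem with numeric features. It is not the paper's MI-Relief strategy: the mutual-information term is omitted because the abstract does not give its form, every sample is visited once instead of being drawn at random, and the function name `relief_weights` is hypothetical.

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Classic Relief feature weights (illustrative sketch, not MI-Relief).

    For each sample, find its nearest neighbour of the same class (hit) and
    of the other class (miss). A feature gains weight when it differs on the
    miss and loses weight when it differs on the hit.
    """
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    span = []
    for j in range(d):
        column = [row[j] for row in X]
        span.append((max(column) - min(column)) or 1.0)

    def diff(j: int, a: Sequence[float], b: Sequence[float]) -> float:
        return abs(a[j] - b[j]) / span[j]

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(j, a, b) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no usable neighbour for this sample
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n
    return weights


if __name__ == "__main__":
    # Feature 0 separates the classes; feature 1 is unrelated to the label.
    X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
    y = [0, 0, 1, 1]
    print(relief_weights(X, y))
```

Features with low or negative weights are candidates for removal before training; how the paper thresholds the weights is not stated in the abstract.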