Authors: Cao Jiaqing (曹家庆); Wu Guanmao (吴观茂) [1]
Affiliation: [1] School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China
Source: Information Technology and Network Security (《信息技术与网络安全》), 2018, No. 5, pp. 84-87, 92 (5 pages)
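Funding: National Natural Science Foundation of China (61471004); Anhui University of Science and Technology Graduate Innovation Fund Project (2017CX2045)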
Abstract: The convergence rate of an existing greedy EM algorithm slows sharply on large-scale data sets. To address this, the paper proposes a greedy EM algorithm based on MapReduce. The algorithm first uses Map (mapping) to distribute the data, processing each node and generating the corresponding key-value pairs. It then uses Reduce (reduction) to combine the generated key-value pairs. Finally, it obtains the optimal Gaussian mixture model and, from it, the number of model components. Compared with the traditional EM algorithm and the greedy EM algorithm, the experimental results show that the proposed algorithm clearly improves convergence speed while still accurately obtaining the number of components of the Gaussian mixture model.
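As a rough illustration of the Map and Reduce steps named in the abstract, the sketch below runs EM iterations for a one-dimensional Gaussian mixture in a MapReduce style. It is not the authors' implementation: the record gives no algorithmic detail, so the key-value layout (component index mapped to partial sufficient statistics), the function names `map_partition`, `reduce_pairs`, and `em_iteration`, and the in-process "partitions" standing in for cluster nodes are all assumptions. The greedy step that inserts new components is omitted.

```python
import math
from collections import defaultdict


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def map_partition(points, weights, means, variances):
    """Map (assumed layout): for each point in one partition, emit
    (component index, (r, r*x, r*x^2)) where r is the responsibility."""
    pairs = []
    for x in points:
        dens = [w * gaussian_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        total = sum(dens)
        for k, d in enumerate(dens):
            r = d / total
            pairs.append((k, (r, r * x, r * x * x)))
    return pairs


def reduce_pairs(pairs):
    """Reduce: sum the partial statistics for each component key."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for k, (r, rx, rxx) in pairs:
        s = sums[k]
        s[0] += r
        s[1] += rx
        s[2] += rxx
    return sums


def em_iteration(partitions, weights, means, variances):
    """One EM iteration: map every partition, reduce, then update parameters."""
    pairs = []
    for part in partitions:  # on a real cluster these calls would run in parallel
        pairs.extend(map_partition(part, weights, means, variances))
    sums = reduce_pairs(pairs)
    n = sum(len(p) for p in partitions)
    new_w, new_m, new_v = [], [], []
    for k in range(len(weights)):
        r, rx, rxx = sums[k]
        mean = rx / r
        new_w.append(r / n)
        new_m.append(mean)
        # Floor the variance so a tight cluster cannot collapse to zero.
        new_v.append(max(rxx / r - mean * mean, 1e-6))
    return new_w, new_m, new_v


if __name__ == "__main__":
    partitions = [[0.0, 0.2, -0.1], [5.0, 5.1, 4.9]]  # toy data, two "nodes"
    w, m, v = [0.5, 0.5], [1.0, 4.0], [1.0, 1.0]
    for _ in range(20):
        w, m, v = em_iteration(partitions, w, m, v)
    print(w, m, v)
```

Because each mapper only emits per-component sums, the reducer's work depends on the number of components rather than on the number of data points, which is one plausible reason a MapReduce formulation could help on large data sets.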
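Keywords: greedy EM algorithm; machine learning; data mining; MapReduce framework
Classification: TP301.6 [Automation and Computer Technology: Computer Systems Architecture]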