Research on a MapReduce-based parallel mining algorithm for health big data


Authors: CHEN Yu (陈榆); HE Huimin (何慧敏); LIANG Zhisheng (梁志胜); OU Xu (欧旭)

Affiliation: [1] Information Center, Guangxi Medical University, Nanning 530021, Guangxi, China

Source: Modern Electronics Technique (《现代电子技术》), 2023, No. 12, pp. 79-83 (5 pages)

Funding: 2018 Guangxi Higher Education Undergraduate Teaching Reform Project (2018JGA144).

Abstract: With the development of information technology, health big data is growing exponentially, but the sheer volume buries much valuable data and makes it difficult to improve the quality and efficiency of medical services. To address this problem, a MapReduce-based parallel mining algorithm for health big data is proposed. First, the health big data is preprocessed to eliminate the influence of unfavorable factors on the data. Next, based on the preprocessed data, initial cluster centers are obtained, the distance between each data item and the cluster centers is measured, and the data is clustered. Finally, MapReduce is used to build a parallel mining program for health big data, and executing that program completes the parallel mining. Experimental results show that the proposed algorithm reaches a maximum mining efficiency of 94 GB/s and a maximum speedup ratio of 4.5, and that it outperforms the other methods compared in mining health big data.
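The three stages named in the abstract (preprocessing, distance-based clustering around cluster centers, and a map/reduce formulation) can be illustrated with a small single-process sketch. This is not the authors' implementation: the abstract does not state the preprocessing method, the distance metric, or the update rule, so min-max normalization, Euclidean distance, and a k-means-style mean update are assumptions made here for illustration, and all function names are hypothetical.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def normalize(records):
    """Min-max scale each feature to [0, 1] (assumed preprocessing step)."""
    lows = [min(col) for col in zip(*records)]
    highs = [max(col) for col in zip(*records)]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(rec, lows, highs))
        for rec in records
    ]


def map_phase(record, centers):
    """Map: emit (index of the nearest center, record)."""
    nearest = min(range(len(centers)), key=lambda i: dist(record, centers[i]))
    return nearest, record


def reduce_phase(members):
    """Reduce: average all records assigned to one cluster."""
    n = len(members)
    return tuple(sum(col) / n for col in zip(*members))


def kmeans_iteration(records, centers):
    """One map/shuffle/reduce round; returns the updated centers."""
    groups = defaultdict(list)
    for rec in records:  # map, then group by key (the shuffle step)
        key, value = map_phase(rec, centers)
        groups[key].append(value)
    # A center with no assigned records is kept unchanged.
    return [reduce_phase(groups[i]) if groups[i] else centers[i]
            for i in range(len(centers))]
```

In an actual MapReduce job, `map_phase` would run in parallel over data splits and `reduce_phase` once per cluster key. The loop in `kmeans_iteration` only simulates that grouping on one machine, and the round would be repeated until the centers stop changing.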

Keywords: health big data; parallel mining algorithm; MapReduce; data preprocessing; data clustering; mining program

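Classification No.: TN919-34 [Electronics and Telecommunications: Communication and Information Systems]; TP393 [Electronics and Telecommunications: Information and Communication Engineering]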

 
