Optimization Method of Spark Performance Based on Nearest Neighbor Regression


Author: ZHANG Wei[1] (Hubei University of Traditional Chinese Medicine, Wuhan 430065, China)

Affiliation: [1] Hubei University of Traditional Chinese Medicine, Wuhan 430065, Hubei, China

Source: Video Engineering (《电视技术》), 2022, No. 9, pp. 47-50 (4 pages)

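Funding: Guiding Project of the Science and Technology Research Program of the Hubei Provincial Department of Education, "Research on a Spark Performance Configuration Optimization Method Based on k-Nearest-Neighbor Regression" (No. B2018103).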

Abstract: Spark is an in-memory distributed computing model that delivers a substantial performance improvement over Hadoop's MapReduce model, and it is therefore widely used in big data processing. Because of this wide adoption, how to improve Spark's performance has become a focal issue. At present, one of the most common optimization approaches is to build a model that maps configuration parameters to performance using machine learning, and then to solve that model with an intelligent algorithm to obtain the optimal configuration. However, a running Spark job is affected by many factors, so the observed results of samples tend to fluctuate, and this fluctuation degrades the quality of the model. To address this, a method based on nearest neighbor regression is proposed for building the Spark performance model: the attention mechanism of the nearest neighbors reduces the influence of fluctuations in sample observations and improves model quality, which in turn allows Spark's performance to be improved more effectively.
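The abstract does not give the model's details, so the following is only a minimal sketch of the general idea: predict a job's runtime for an untried configuration from its nearest observed configurations, averaging them so that noise in any single observation is damped. The inverse-distance weighting is an assumption standing in for the paper's neighbor attention mechanism, and the function names, the `eps` parameter, and the exhaustive scan over candidates (in place of an intelligent search algorithm) are all hypothetical.

```python
import math


def knn_predict(samples, query, k=3, eps=1e-9):
    """Predict runtime for `query` as a distance-weighted mean of its k nearest samples.

    samples: list of (config_vector, observed_runtime) pairs, where each
    config_vector is a tuple of normalised configuration values.
    """
    # Keep the k observed configurations closest to the query (Euclidean distance).
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    # Assumption: closer neighbours get larger weights via inverse distance.
    weights = [1.0 / (math.dist(cfg, query) + eps) for cfg, _ in nearest]
    total = sum(weights)
    # Weighted average of the neighbours' observed runtimes.
    return sum(w * runtime for w, (_, runtime) in zip(weights, nearest)) / total


def best_config(samples, candidates, k=3):
    """Return the candidate configuration with the lowest predicted runtime."""
    return min(candidates, key=lambda c: knn_predict(samples, c, k))
```

In use, `samples` would hold measured runs (for example, normalised executor memory and core counts paired with runtimes in seconds), and `best_config(samples, candidates)` would return the candidate the model expects to run fastest. Averaging over several neighbours is what limits the effect of a single fluctuating measurement, at the cost of some smoothing of genuine differences between nearby configurations.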

Keywords: nearest neighbor regression; performance optimization; Spark

Classification: TP311.5 [Automation and Computer Technology: Computer Software and Theory]

 
