Authors: LI Shuo; LIANG Yi (Faculty of Information, Beijing University of Technology, Beijing 100124, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), 2021, No. 5, pp. 79-87 (9 pages)
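Funding: National Key Research and Development Program of China (2017YFC0803300); National Natural Science Foundation of China, General Program (91546111).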
Abstract: Predicting the execution time of Spark batch applications is a key technique for guiding resource allocation and application balancing in Spark systems. However, existing work applies a single unified prediction model to applications with different runtime characteristics and considers few factors when building the model, which reduces prediction accuracy. To address these problems, this paper proposes an execution time prediction model for Spark batch applications that accounts for differences in application characteristics. The model first classifies Spark batch application execution times based on strongly correlated metrics and then, for each application category, uses the PCA and GBDT algorithms to predict execution time. When an ad hoc application arrives, it is assigned to an application category and its execution time is predicted with the corresponding model. Experimental results show that, compared with a unified prediction model, the proposed method reduces the root mean square error and the mean absolute percentage error of the predictions by 32.1% and 33.9% on average, respectively.
Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
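The abstract reports its results as average reductions in root mean square error (RMSE) and mean absolute percentage error (MAPE). As a minimal sketch of how these two metrics are conventionally computed, and how a relative reduction between two models could be expressed, consider the following. The function names and the sample values are illustrative assumptions and do not come from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs (e.g. seconds)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def relative_reduction(baseline_error, improved_error):
    """Percentage by which improved_error is lower than baseline_error."""
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Hypothetical execution times in seconds (not data from the paper).
actual_times = [100.0, 200.0]
predicted_times = [110.0, 180.0]
print(rmse(actual_times, predicted_times))
print(mape(actual_times, predicted_times))
```

RMSE penalizes large individual errors more heavily, while MAPE normalizes each error by the true execution time, which is presumably why the two are reported together.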