Authors: 李慧芳 (LI Hui-Fang)[1], 黄姜杭 (HUANG Jiang-Hang), 徐光浩 (XU Guang-Hao), 夏元清 (XIA Yuan-Qing)[1] (Key Laboratory of Intelligent Control and Decision of Complex Systems, Beijing Institute of Technology, Beijing 100081)
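Affiliation: [1] State Key Laboratory of Intelligent Control and Decision of Complex Systems, Beijing Institute of Technology, Beijing 100081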
Source: Acta Automatica Sinica (《自动化学报》), 2023, No. 1, pp. 67-78 (12 pages)
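Funding: Supported by the National Key Research and Development Program of China (2018YFB1003700) and the National Natural Science Foundation of China (61836001).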
Abstract: Task runtime estimation is a prerequisite for workflow scheduling in cloud data centers. However, existing runtime prediction methods for workflow tasks fail to extract categorical and numerical features effectively. In this paper, we propose a runtime prediction approach for workflow tasks based on multi-dimensional feature fusion. First, we construct a stacked residual recurrent neural network with an attention mechanism that maps categorical data from a high-dimensional sparse feature space to a low-dimensional dense one, strengthening the model's ability to parse categorical data and extract categorical features. Second, extreme gradient boosting is used to discretize and encode the numerical data; sparsifying the input vectors from the dense space improves the nonlinear representation capability of the numerical features. Third, we design a heterogeneous multi-dimensional feature fusion strategy that fuses the extracted categorical and numerical features with the original input features of each sample. Finally, a prediction model is built on the resulting fused features to predict the runtimes of cloud workflow tasks accurately. To verify the effectiveness of the proposed method, we conduct experiments on a cluster dataset from a real cloud data center. The experimental results show that our approach achieves higher prediction accuracy than the existing baseline algorithms and can be applied to big data-driven runtime prediction for cloud workflow tasks.
Keywords: cloud data center; workflow; ensemble learning; feature fusion; runtime prediction
Classification: TP393.09 [Automation and Computer Technology: Computer Application Technology]
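
The abstract describes two of its steps only at a high level: turning dense numerical inputs into a sparse discrete code with boosted trees, and fusing that code with the categorical features and the original inputs. The sketch below illustrates one common way such a pipeline is assembled, under loud assumptions: each boosted tree is replaced by a hand-written "stump" (a feature index plus split thresholds), the categorical embedding is a made-up vector rather than the output of the paper's recurrent network, and fusion is plain concatenation. None of these names or choices come from the paper itself.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for one trained boosted tree:
# (index of the numerical feature it splits on, sorted split thresholds).
Stump = Tuple[int, Sequence[float]]


def leaf_index(value: float, thresholds: Sequence[float]) -> int:
    """Return the interval (leaf) that the value falls into."""
    return sum(1 for t in thresholds if value >= t)


def encode_numeric(values: Sequence[float], stumps: Sequence[Stump]) -> List[int]:
    """One-hot encode the leaf reached in each stump and concatenate the codes."""
    code: List[int] = []
    for feature_idx, thresholds in stumps:
        one_hot = [0] * (len(thresholds) + 1)  # one slot per leaf
        one_hot[leaf_index(values[feature_idx], thresholds)] = 1
        code.extend(one_hot)
    return code


def fuse(cat_embedding: Sequence[float],
         numeric_code: Sequence[int],
         raw_features: Sequence[float]) -> List[float]:
    """Assumed fusion strategy: concatenate the three feature groups."""
    return [float(v) for v in (*cat_embedding, *numeric_code, *raw_features)]


# Illustrative values only (not taken from any dataset).
stumps: List[Stump] = [(0, [10.0]), (1, [0.5, 2.0])]
raw = [12.0, 1.0]                 # two numerical task attributes
cat_embedding = [0.2, -0.1]       # placeholder dense categorical embedding
fused = fuse(cat_embedding, encode_numeric(raw, stumps), raw)
```

In a real setting the stumps would be replaced by the leaf indices of a trained gradient-boosted model, and the fused vector would be fed to a downstream regressor that predicts the task runtime.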