Database Workload Prediction Method Based on iForest-BiLSTM-Attention (cited by: 5)


Authors: JI Lixia[1,2], ZHAO Yao, MA Zhengyi[1], ZHAO Runzhe, ZHANG Han

Affiliations: [1] School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450002, Henan, China; [2] Zhengzhou Key Laboratory of Blockchain and Data Intelligence, Zhengzhou 450002, Henan, China

Source: Journal of Zhengzhou University (Natural Science Edition), 2022, No. 6, pp. 66-73 (8 pages)

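Funding: National Natural Science Foundation of China (52179144); Major Science and Technology Project of Henan Province (201300210500); Major Science and Technology Innovation Project of Zhengzhou (2020CXZX0053).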

Abstract: Database workload prediction suffers from three problems: predictions fail when physical resources change, models are sensitive to abnormal data, and models ignore the latent weighted hidden-layer feature states in sequence changes, which lowers prediction accuracy. To address these problems, a database workload prediction method based on iForest-BiLSTM-Attention is proposed, building on the long short-term memory (LSTM) network model. First, internal indicators from the database benchmark specification are added, which solves the failure of traditional indicators when physical resources change. Second, multiple isolation trees are built and integrated into an isolation forest, which evaluates the anomaly score of each sample; abnormal data are screened out and filled by hot deck imputation. Finally, the attention mechanism is combined with a bidirectional long short-term memory (BiLSTM) network to compute the hidden-layer states and attention weights and to learn the shape, periodicity, and regularity of the workload. Experimental results show that the proposed method significantly improves database workload prediction accuracy over existing methods, with R² values of 0.93 for throughput and 0.95 for CPU utilization.
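The anomaly-handling step in the abstract (isolation trees combined into an isolation forest, followed by hot deck imputation) can be illustrated with a minimal sketch. This is not the authors' implementation, which the record does not show. It assumes a one-dimensional throughput series, and the score threshold of 0.6, the tree count, the sample size, and the choice of the nearest non-anomalous point in time as the hot-deck donor are all hypothetical.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Recursively split on a random threshold until points are isolated."""
    if depth >= max_depth or len(values) <= 1 or min(values) == max(values):
        return ("leaf", len(values))
    split = rng.uniform(min(values), max(values))
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return ("node", split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(x, tree, depth=0):
    """Depth at which x lands, adjusted for unsplit points left in the leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, split, left, right = tree
    return path_length(x, left if x < split else right, depth + 1)


def anomaly_scores(values, n_trees=100, sample_size=64, seed=0):
    """Isolation-forest score in (0, 1]; values near 1 are easy to isolate."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2.0 ** (-sum(path_length(v, t) for t in trees) / (n_trees * norm))
            for v in values]


def hot_deck_fill(values, scores, threshold=0.6):
    """Replace flagged points with the value of the nearest normal point in time.

    The donor rule and the threshold are assumptions made for this sketch.
    """
    normal = [i for i, s in enumerate(scores) if s <= threshold]
    filled = list(values)
    for i, s in enumerate(scores):
        if s > threshold and normal:
            donor = min(normal, key=lambda j: abs(j - i))
            filled[i] = values[donor]
    return filled


# Hypothetical throughput samples with one injected spike at index 17.
series = [100 + (i % 5) for i in range(40)]
series[17] = 900
scores = anomaly_scores(series)
filled = hot_deck_fill(series, scores)
```

The spike is isolated after very few random splits, so its average path length is short and its score is high, while ordinary points need more splits and score lower. The imputed series, not the raw one, would then be passed to the sequence model.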

Keywords: database workload prediction; bidirectional long short-term memory network; attention mechanism; isolation forest

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
