Research on Reconstruction Anomaly Detection Algorithm of Industrial Time Series Data Based on XGBoost Feature Selection

Cited by: 1


Authors: Zhou Xurong (周旭荣), Zheng Jianli (郑建立)[1]

Affiliation: [1] College of Information Science and Technology, Donghua University, Shanghai

Source: Computer Science and Application (《计算机科学与应用》), 2022, No. 3, pp. 590-601 (12 pages)

Abstract: Industrial production generates large volumes of time series data. How to remove useless data effectively, judge whether the data collected by sensors are correct, and detect anomalies in the time series has become a focus of research. Many researchers have proposed anomaly detection algorithms, but most consider only the temporal characteristics of the data and ignore the correlations between sensors. This paper therefore proposes a Multi-Dimensional Self-Attention Convolutional Gated Recurrent Encoder and Decoder (MDACGA) based on XGBoost feature selection. XGBoost first screens the original data set: variables are scored, irrelevant variables are eliminated, and the valid variables are retained. The retained variables are then used to construct feature matrices. A fully convolutional encoder encodes the feature matrices to extract correlation features between the different time series, and an attention-based ConvGRU extracts their temporal features. Finally, a convolutional decoder jointly decodes the feature matrices obtained in the previous step to produce the reconstructed feature matrix, and the Adam optimizer and mini-batch stochastic gradient descent are used to minimize the reconstruction error. Anomalies are then detected from the residual feature matrix. Experimental results show that the algorithm achieves an accuracy of 0.989 and a recall of 0.996, which indicates that it is effective and that it outperforms the general baseline algorithms.
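The non-neural steps of the pipeline described in the abstract (score-based variable screening, feature matrix construction, and detection from the residual matrix) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not define how the feature matrix is built, so a windowed pairwise inner product is used as one plausible choice; the importance scores, the cutoff, the element threshold, and all function names are hypothetical; and the encoder, ConvGRU, and decoder are not implemented, so a hand-written matrix stands in for the reconstruction.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def select_features(scores: Dict[str, float], min_score: float) -> List[str]:
    """Keep variables whose importance score reaches min_score (hypothetical cutoff)."""
    return [name for name, score in scores.items() if score >= min_score]


def feature_matrix(series: Dict[str, Sequence[float]], names: List[str],
                   end: int, window: int) -> Matrix:
    """Pairwise inner products of the selected series over `window` steps ending at `end`.

    Assumption: the abstract does not specify this construction; the scaled
    inner product is only one way to capture inter-sensor correlation.
    """
    start = end - window
    if start < 0:
        raise ValueError("window extends before the start of the series")
    return [
        [sum(series[a][t] * series[b][t] for t in range(start, end)) / window
         for b in names]
        for a in names
    ]


def residual_matrix(original: Matrix, reconstructed: Matrix) -> Matrix:
    """Element-wise difference between the input and its reconstruction."""
    return [[o - r for o, r in zip(o_row, r_row)]
            for o_row, r_row in zip(original, reconstructed)]


def anomaly_score(residual: Matrix, elem_threshold: float) -> int:
    """Count residual entries whose magnitude exceeds elem_threshold (hypothetical rule)."""
    return sum(1 for row in residual for value in row if abs(value) > elem_threshold)


# Toy data; the scores would come from a trained XGBoost model in practice.
scores = {"temp": 0.42, "pressure": 0.35, "fan_id": 0.01}
series = {
    "temp": [1.0, 2.0, 3.0, 4.0],
    "pressure": [2.0, 2.0, 2.0, 2.0],
    "fan_id": [7.0, 7.0, 7.0, 7.0],
}

kept = select_features(scores, min_score=0.05)
original = feature_matrix(series, kept, end=4, window=2)
# Stand-in for the decoder output; a real model would produce this matrix.
reconstructed = [[12.4, 7.0], [7.0, 1.0]]
score = anomaly_score(residual_matrix(original, reconstructed), elem_threshold=0.5)
```

In this toy run one entry of the residual matrix exceeds the element threshold, so `score` is 1; a time step would be flagged as anomalous when its score exceeds a cutoff chosen on validation data, which is again an assumption rather than a rule stated in the abstract.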

Keywords: time series data; XGBoost; convolutional encoder; decoder

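Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP393.08 [Automation and Computer Technology: Control Science and Engineering]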

 
