Efficient Feature Extraction Using Apache Spark for Network Behavior Anomaly Detection  (Cited by: 2)

Authors: Xiaoming Ye, Xingshu Chen, Dunhu Liu, Wenxian Wang, Li Yang, Gang Liang, Guolin Shao

Affiliations: [1] School of Cybersecurity, Chengdu University of Information Technology, Chengdu 610225, and the College of Computer Science, Sichuan University, Chengdu 610065, China; [2] College of Cybersecurity, Sichuan University, Chengdu 610065, China; [3] School of Management, Chengdu University of Information Technology, Chengdu 610103, China; [4] College of Computer Science, Sichuan University, Chengdu 610065, China

Source: Tsinghua Science and Technology (清华大学学报(自然科学版)(英文版)), 2018, No. 5, pp. 561-573 (13 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 61272447); Sichuan Province Science and Technology Planning (Nos. 2016GZ0042, 16ZHSF0483, and 2017GZ0168); Key Research Project of Sichuan Provincial Department of Education (Nos. 17ZA0238 and 17ZA0200); Scientific Research Starting Foundation for Young Teachers of Sichuan University (No. 2015SCU11079)

Abstract: Extracting and analyzing network traffic features is fundamental to the design and implementation of network behavior anomaly detection methods. Traditional network traffic feature methods focus on the statistical features of traffic volume, which are not sufficient to reflect communication patterns. A different approach is required to detect anomalous behaviors that do not exhibit traffic volume changes, such as low-intensity anomalous behaviors caused by Denial of Service/Distributed Denial of Service (DoS/DDoS) attacks, Internet worms and scanning, and botnets. We propose an efficient traffic feature extraction architecture that combines the benefits of traffic volume features and network communication pattern features. This method can detect both low-intensity anomalous network behaviors and conventional traffic volume anomalies. We implemented our approach on Spark Streaming and validated our feature set using a labelled real-world dataset collected from the Sichuan University campus network. Our results demonstrate that the traffic feature extraction approach is efficient in detecting both traffic variations and communication structure changes. Based on our evaluation of the MIT-DARPA dataset, the same detection approach achieves a detection precision of 82.3% with traffic volume features and 89.9% with communication pattern features. Our proposed feature set improves precision by 94%.
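The distinction the abstract draws between traffic volume features and communication pattern features can be illustrated with a minimal sketch. It is not the authors' feature set or their Spark Streaming implementation: the flow-record schema `(timestamp, src, dst, n_bytes)`, the function name `window_features`, the 60-second window, and the five feature names are all assumptions chosen for illustration, and plain Python stands in for Spark to keep the example self-contained.

```python
from collections import defaultdict


def window_features(flows, window_seconds=60):
    """Group flows into fixed time windows and compute per-window features.

    flows: iterable of (timestamp, src, dst, n_bytes) tuples (hypothetical schema).
    Returns a dict mapping window index -> feature dict.
    """
    buckets = defaultdict(list)
    for ts, src, dst, n_bytes in flows:
        buckets[int(ts // window_seconds)].append((src, dst, n_bytes))

    features = {}
    for window, records in sorted(buckets.items()):
        # Communication graph for this window: one directed edge per (src, dst) pair.
        edges = {(src, dst) for src, dst, _ in records}
        out_peers = defaultdict(set)
        for src, dst in edges:
            out_peers[src].add(dst)
        features[window] = {
            # Volume features (illustrative)
            "flow_count": len(records),
            "total_bytes": sum(b for _, _, b in records),
            # Communication pattern features (illustrative)
            "distinct_hosts": len({host for edge in edges for host in edge}),
            "distinct_edges": len(edges),
            "max_out_degree": max(len(peers) for peers in out_peers.values()),
        }
    return features


# Two windows with identical byte totals but different communication structure:
# window 0 is one host pair exchanging data, window 1 is one host contacting
# four different peers (a scan-like pattern).
flows = [
    (0, "10.0.0.1", "10.0.0.2", 500),
    (10, "10.0.0.1", "10.0.0.2", 500),
    (70, "10.0.0.9", "10.0.0.2", 250),
    (75, "10.0.0.9", "10.0.0.3", 250),
    (80, "10.0.0.9", "10.0.0.4", 250),
    (85, "10.0.0.9", "10.0.0.5", 250),
]
result = window_features(flows)
```

In this toy input both windows carry the same number of bytes, so a volume-only feature such as `total_bytes` cannot separate them, while `max_out_degree` and `distinct_edges` differ. This is the kind of low-intensity case the abstract says volume statistics alone fail to reflect.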

Keywords: feature extraction; graph theory; network behavior anomaly detection; Apache Spark

Classification number: TP393.08 [Automation and Computer Technology: Computer Application Technology]
