Traffic Classification Algorithm for IoT Device Based on Sliding Time Window

Cited by: 2


Authors: YU Changhong[1]; LU Ya; WANG Haixin; GAO Ming[1]

Affiliation: [1] School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou 310000, China

Source: Computer Engineering (《计算机工程》), 2023, No. 7, pp. 259-268 (10 pages)

Funding: National Natural Science Foundation of China (61871468); Key Research and Development Program of Zhejiang Province (2017C01G2050953).

Abstract: Existing traffic classification schemes for Internet of Things (IoT) devices mostly rely on either the complete flow or the first few packets of a flow. Relying on the complete flow increases the volume of traffic data, and with it the computational complexity and storage consumption, yet the storage space and CPU performance of IoT devices are very limited. Relying on the first few packets means that classification degrades when some of those packets are lost. To address these problems, this paper proposes a random forest traffic classification algorithm for IoT devices based on a sliding time window, which uses IoT traffic information to characterize the attributes of different devices. First, based on the time dependence of IoT device traffic, a sliding time window divides each flow into multiple sub-flows with a time period of T. Second, because IoT device traffic is encrypted, flow information and packet information from the head of the flow are extracted from each sub-flow to build a feature vector. Finally, a classification model is built on the random sampling and random feature selection of the random forest, which strengthens the generalization ability of the model and further improves classification performance. Experimental results on the public UNSW dataset show that the algorithm achieves a classification accuracy of 96.23%, a precision of 94.8%, a recall of 91.47%, and an F1 score of 93%, indicating good classification performance.
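The first two steps of the abstract (splitting a flow into sub-flows of period T, then building one feature vector per sub-flow) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the window step, the exact feature set, or the packet representation, so the `step` parameter, the `n_head` parameter, the `(timestamp, length)` packet tuple, and the chosen features (packet count, byte count, mean packet length, duration, lengths of the first few packets) are all assumptions. The resulting vectors would then be passed to a random forest classifier, which is not shown here.

```python
from typing import List, Tuple

# Assumed packet representation: (timestamp in seconds, packet length in bytes).
Packet = Tuple[float, int]


def split_into_subflows(packets: List[Packet], T: float, step: float) -> List[List[Packet]]:
    """Slide a window of length T over one flow, advancing by `step` seconds.

    step == T gives back-to-back windows; step < T gives overlapping windows.
    Which of the two the paper uses is not stated in the abstract.
    """
    if T <= 0 or step <= 0:
        raise ValueError("T and step must be positive")
    if not packets:
        return []
    ordered = sorted(packets)          # order packets by timestamp
    start, end = ordered[0][0], ordered[-1][0]
    subflows = []
    k = 0
    while start + k * step <= end:
        lo = start + k * step
        window = [p for p in ordered if lo <= p[0] < lo + T]
        if window:                     # skip windows that contain no packets
            subflows.append(window)
        k += 1
    return subflows


def feature_vector(subflow: List[Packet], n_head: int = 3) -> List[float]:
    """Hypothetical features: sub-flow statistics plus the first n_head packet lengths."""
    sizes = [size for _, size in subflow]
    duration = subflow[-1][0] - subflow[0][0]
    head = sizes[:n_head]
    head = head + [0] * (n_head - len(head))   # zero-pad short sub-flows
    return [len(sizes), sum(sizes), sum(sizes) / len(sizes), duration] + head


# Toy flow with four packets, split with T = 1 s.
flow = [(0.0, 100), (0.5, 200), (1.2, 150), (2.7, 60)]
subflows = split_into_subflows(flow, T=1.0, step=1.0)
vectors = [feature_vector(s) for s in subflows]
```

Because every sub-flow yields its own feature vector, a device can still be classified from later windows even if packets at the very start of the flow were lost, which is the motivation the abstract gives for windowing.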

Keywords: Internet of Things; traffic classification; network security; random forest; device management; quality of service

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
