Research on Classification of Network Traffic Based on Random Forests Algorithm (基于随机森林算法的网络流量分类方法)  Cited by: 9


Authors: Zhao Xiaohuan (赵小欢)[1], Xia Jingbo (夏靖波)[1], Li Minghui (李明辉)[2]

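Affiliations: [1] School of Information and Navigation, Air Force Engineering University, Xi'an 710071; [2] Air Force Logistics Department, Beijing 100720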

Source: Journal of China Academy of Electronics and Information Technology (《中国电子科学研究院学报》), 2013, No. 2, pp. 184-190 (7 pages)

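Funding: Key Project of the Natural Science Basic Research Plan of Shaanxi Province (2012JZ8005); Military Science Graduate Research Projects of the Armed Forces (2010XXXX-488; 2011XXX-X23)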

Abstract: Accurate classification of network traffic is key to making the Internet controllable and manageable, and it matters for both network management and network security. Conventional single classification algorithms must build a model on specific assumptions and place strict requirements on the distribution of the data to be classified, so they cannot meet the demands of classifying complex and changing network traffic, which is multifractal and bursty. For this reason, the Random Forests (RF) algorithm, which combines multiple decision trees, is used to classify network traffic. Experiments on real network traffic data show that, in all cases tested, RF markedly improves classification results, especially for traffic classes that make up only a small proportion of the samples. RF reduces the dependence of single algorithms on a specific assumed model and places weaker requirements on the distribution of the samples to be classified. It achieves good classification performance, with better adaptability and robustness than a single classifier.

Keywords: traffic classification; traffic feature selection; ensemble classifier; random forests algorithm

Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]

 
