Intrusion detection algorithm based on weighted Kmeans clustering with reverse K-nearest neighbor and density peak initialization  Cited by: 7


Authors: Zhang Ximei [1]; Xie Bin [1,2,3]; Xu Tongtong; Zhang Chunhao

Affiliations: [1] College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China; [2] Hebei Provincial Key Laboratory of Network and Information Security, Hebei Normal University, Shijiazhuang 050024, China; [3] Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics and Data Security, Hebei Normal University, Shijiazhuang 050024, China

Source: Journal of Nanjing University of Science and Technology, 2023, No. 1, pp. 56-65 (10 pages)

Funding: National Natural Science Foundation of China (62076088); Technology Innovation Fund Project of Hebei Normal University (L2020K09).

Abstract: The performance of the traditional Kmeans clustering algorithm is sensitive to the randomness of the initial cluster centers and to the repeated inclusion of edge points and outliers each time the cluster centers are recomputed. To avoid these effects, this paper proposes a weighted Kmeans clustering algorithm based on reverse K-nearest neighbor and density peak initialization. First, the reverse K-nearest neighbors of each sample are computed from the sample's neighbor information, so that density peak points can be found adaptively and used as initial cluster centers for datasets of different scales and density distributions. Then, the relative cluster radius is set adaptively and the cluster centers are updated iteratively with sample weighting, which effectively reduces the influence of edge points and outliers on the clustering results under different data distributions. Experimental results show that the algorithm improves clustering performance while greatly reducing the number of iterations. As the number of intrusion types and the data scale increase, the algorithm still performs well, and it is notably effective at discovering unknown attack types.
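The two stages described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' method: density is taken as the reverse K-nearest-neighbor count, initial centers are ranked by the common density peak score (density times distance to the nearest denser sample), and the weighting rule (full weight inside a fraction `radius_ratio` of the cluster's largest member distance, decaying weight outside it) is a hypothetical stand-in for the paper's adaptive relative cluster radius. All function and parameter names are illustrative.

```python
import numpy as np


def reverse_knn_counts(X, k):
    """Return how many samples list each sample among their k nearest
    neighbors, together with the pairwise distance matrix."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a sample is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]       # k nearest neighbors of each sample
    counts = np.bincount(knn.ravel(), minlength=len(X))
    np.fill_diagonal(d, 0.0)
    return counts, d


def density_peak_init(X, n_clusters, k):
    """Pick initial centers as the samples with the largest rho * delta.
    rho is the reverse-kNN count (assumed density measure); delta is the
    distance to the nearest sample ranked denser (ties broken by index)."""
    counts, d = reverse_knn_counts(X, k)
    rho = counts.astype(float)
    n = len(X)
    order = np.lexsort((np.arange(n), -rho))  # density descending, then index
    delta = np.empty(n)
    delta[order[0]] = d[order[0]].max()       # densest sample: largest distance
    for pos in range(1, n):
        i = order[pos]
        delta[i] = d[i, order[:pos]].min()
    return X[np.argsort(-(rho * delta))[:n_clusters]]


def weighted_kmeans(X, centers, radius_ratio=0.8, max_iter=100, tol=1e-6):
    """Kmeans whose center update down-weights samples far from the center.
    The weighting rule is an assumption made for this sketch."""
    centers = centers.copy()
    for it in range(1, max_iter + 1):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for c in range(len(centers)):
            members = labels == c
            if not members.any():
                continue                      # keep the old center if empty
            dc = dist[members, c]
            radius = radius_ratio * dc.max()
            # weight 1 inside the radius, radius / distance outside it
            w = np.where(dc <= radius, 1.0, radius / np.maximum(dc, 1e-12))
            new_centers[c] = (w[:, None] * X[members]).sum(axis=0) / w.sum()
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return labels, centers, it


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
                   rng.normal(10.0, 0.5, (50, 2))])
    init = density_peak_init(X, n_clusters=2, k=5)
    labels, centers, n_iter = weighted_kmeans(X, init)
    print(centers, n_iter)
```

The pairwise distance matrix needs O(n²) memory, so on intrusion detection datasets of realistic size the neighbor search would have to be replaced by an indexed or batched one. The sketch is only meant to show how a deterministic, density-based initialization and a weighted center update fit together.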

Keywords: Kmeans clustering; intrusion detection; density peak; sample weighting; reverse K-nearest neighbor

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]

 
