Mining Algorithm for Abnormal Traffic Time Points for Unstructured High-Dimensional Big Data (非结构化高维大数据异常流量时间点挖掘算法)


Authors: XIE Hai-yan (解海燕) [1]; LI Jie (李杰) [1]; ZHAO Guo-dong (赵国栋)

Affiliations: [1] Yinchuan University of Energy, Yinchuan, Ningxia 750100, China; [2] Network Information Management Center, Ningxia University, Yinchuan, Ningxia 750021, China

Source: Computer Simulation (《计算机仿真》), 2024, No. 7, pp. 474-478 (5 pages)

Funding: Yinchuan University of Energy Research Project (2023-KY-Z-3); Natural Science Foundation of Ningxia (2021aac03118).

Abstract: Unstructured data is high-dimensional: each sample carries a very large number of features, which gives rise to the curse of dimensionality. Reducing the dimension while still extracting effective features is therefore difficult, and this limits the accuracy of mining abnormal time points in big data traffic. To address this, a new spatial-mapping-based method for mining abnormal time points in unstructured high-dimensional big data traffic is proposed. A sparse regression model is built from the geometric characteristics of the approximate solution set, and the sparse projection matrix that maps the high-dimensional target space onto a low-dimensional target subspace is solved. Based on the density distribution, a high-density set is selected as the candidate set of cluster centers, from which the initial cluster centers are determined. A pruning algorithm is then applied to each cluster formed by clustering to select a candidate set of time points, and a secondary judgment on this candidate set yields the abnormal time points in the high-dimensional big data traffic. Experimental results show that dimensionality reduction effectively improves the accuracy of abnormal traffic mining. By comparison, the proposed method mines abnormal time points in high-dimensional big data traffic more accurately and in less time.
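The clustering and screening steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the density measure (neighbour count within a radius), the candidate rule (density at or above the mean), the greedy farthest-point choice of centers, the distance-based pruning rule and the low-density secondary check are all assumptions made here for illustration. The function names and parameters (`radius`, `prune_radius`, `min_density`) are hypothetical. The sparse-projection step is omitted, so the input is assumed to be already reduced to a low dimension, with one row per time point.

```python
import numpy as np


def density_initial_centers(X, k, radius):
    """Pick k initial cluster centers from a high-density candidate set."""
    # Pairwise Euclidean distances between all samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density: number of neighbours within `radius`, excluding the point itself.
    density = (d < radius).sum(axis=1) - 1
    # Candidate set (assumed rule): points whose density is at least the mean density.
    cand = np.flatnonzero(density >= density.mean())
    if len(cand) < k:
        raise ValueError("not enough high-density candidates for k centers")
    # Start from the densest candidate, then greedily add the candidate
    # that is farthest from the centers chosen so far.
    chosen = [int(cand[np.argmax(density[cand])])]
    while len(chosen) < k:
        dist_to_chosen = d[np.ix_(cand, chosen)].min(axis=1)
        chosen.append(int(cand[np.argmax(dist_to_chosen)]))
    return X[chosen]


def abnormal_time_points(X, centers, prune_radius, radius, min_density):
    """Return row indices (time points) flagged as abnormal."""
    # Distance from every sample to its nearest cluster center.
    to_center = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1).min(axis=1)
    # Pruning (assumed rule): samples close to a center are treated as normal.
    cand = np.flatnonzero(to_center > prune_radius)
    # Secondary judgment (assumed rule): keep candidates with low local density.
    d = np.linalg.norm(X[cand][:, None, :] - X[None, :, :], axis=-1)
    density = (d < radius).sum(axis=1) - 1
    return cand[density < min_density]


# Toy data: two dense 4x4 grids of "normal" time points plus one isolated point.
grid = np.array([[0.1 * i, 0.1 * j] for i in range(4) for j in range(4)])
X = np.vstack([grid, grid + 5.0, [[2.5, 10.0]]])

centers = density_initial_centers(X, k=2, radius=1.0)
abnormal = abnormal_time_points(X, centers, prune_radius=1.0, radius=1.0, min_density=3)
print(abnormal.tolist())  # [32], the isolated point
```

In this toy run the isolated point never enters the candidate set for cluster centers because its density is zero, so it cannot distort the initial centers. It then survives pruning and fails the density check, which is the role the abstract assigns to the secondary judgment.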

Keywords: unstructured data; high-dimensional big data; traffic; abnormal time points; mining method

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
