Authors: ZHOU Yan (周燕)[1], XIAO Li (肖莉)[1]
Affiliation: [1] College of Mathematics and Information, South China Agricultural University, Guangzhou 510642, Guangdong, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2023, No. 1, pp. 108-115 (8 pages)
Funding: National Social Science Fund of China, General Program (21BTJ057).
Abstract: To address the high time complexity and unsatisfactory accuracy of traditional association clustering algorithms when mining abnormal network data, the Spark-MML clustering algorithm is proposed. A parallelized frequent itemset mining environment is designed for the Apriori association rule algorithm, and an interest degree constraint together with an adaptive support strategy is used to mine strong association rules among network data features. A variable-grid local outlier detection algorithm is used to eliminate K-means clustering outliers, and the cluster centers and the value of K are determined from the maximum and minimum distances, so that network data are divided into abnormal and non-abnormal classes. Test results show that the method keeps cluster center selection from falling into a local optimum, reduces the time complexity of abnormal data mining, and effectively saves algorithm running space, making it a reliable method for mining abnormal network data.
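Keywords: association rules; interest degree; outliers; clustering; frequent itemsets; feature extraction; abnormal data
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]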
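
Illustrative note: the abstract says the cluster centers and the value of K are determined from maximum and minimum distances, but the record gives no further detail. The sketch below is the textbook max-min distance procedure, not the authors' implementation. The function name `max_min_centers`, the choice of the first point as seed, and the stopping ratio `theta` are all assumptions made for illustration. The Spark parallelization, Apriori rule mining, and variable-grid outlier removal are not shown.

```python
import math


def max_min_centers(points, theta=0.5):
    """Textbook max-min distance center selection (illustrative only).

    points: list of equal-length numeric tuples.
    theta:  hypothetical stopping ratio in (0, 1). A candidate becomes a new
            center only if its distance to the nearest existing center
            exceeds theta * (distance between the first two centers).
    Returns the chosen centers; their count can serve as K for K-means.
    """
    if not points:
        return []
    centers = [points[0]]  # assumption: seed with the first point
    # min_dist[i] = distance from points[i] to its nearest chosen center
    min_dist = [math.dist(p, centers[0]) for p in points]
    # Candidate for the second center: the point farthest from the seed.
    idx = max(range(len(points)), key=min_dist.__getitem__)
    base = min_dist[idx]
    if base == 0:
        return centers  # all points coincide, so one center is enough
    while True:
        centers.append(points[idx])
        # Refresh nearest-center distances with the newly added center.
        min_dist = [min(d, math.dist(p, points[idx]))
                    for d, p in zip(min_dist, points)]
        # Next candidate: the point with the largest nearest-center distance.
        idx = max(range(len(points)), key=min_dist.__getitem__)
        if min_dist[idx] <= theta * base:
            return centers


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = max_min_centers(data)
print(len(centers))  # 2
```

Because each new center is the point farthest from all centers chosen so far, the initial centers are spread across the data, which is the usual argument for why this initialization is less prone to poor local optima than random seeding. The resulting centers and K would then be passed to an ordinary K-means run.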