Hybrid Data Frequent Itemset Mining Algorithm Based on Apriori Algorithm  (cited by: 3)


Authors: YAN Li-xia (闫利霞); LING Xing-hong (凌兴宏) [2]; NI Hong-tao (尼洪涛)

Affiliations: [1] Computing Science & Artificial Intelligence College, Suzhou City University, Suzhou, Jiangsu 215104, China; [2] School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China

Source: Computer Simulation (《计算机仿真》), 2023, No. 12, pp. 538-542 (5 pages)

Funding: Priority Academic Program Development of Jiangsu Higher Education Institutions (93K172021K08).

Abstract: Mixed data involve both discrete and continuous attributes, which makes frequent itemset mining computationally expensive. To improve computational efficiency, a frequent itemset mining algorithm for mixed data is proposed. The Apriori algorithm is used to analyze the relationships among the itemsets in a transaction database, association rules are formulated from the minimum support calculation, and an undirected graph is generated. An adjacency matrix is then built, and the position of each itemset of the transaction database within the matrix is analyzed. All transaction data in the undirected graph are stored in the adjacency matrix, so that frequent 1-itemsets and frequent 2-itemsets are generated quickly; frequent itemset mining is then completed through join operations between itemsets. Finally, a filtering algorithm is applied to the frequent itemsets in the different storage links to improve mining accuracy. Experimental results show that the proposed method uses less memory and mines frequent itemsets more efficiently, which the authors consider significant for the development of data mining technology.
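The adjacency-matrix step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `build_adjacency` and `frequent_itemsets`, the `min_count` parameter (an absolute support threshold), and the toy transactions are all assumptions made for illustration. The sketch also assumes that continuous attributes have already been discretized into items, and it leaves out the filtering step, which the abstract does not specify.

```python
from itertools import combinations


def build_adjacency(transactions):
    """Build a co-occurrence (adjacency) matrix over the items.

    matrix[i][i] is the support count of item i alone;
    matrix[i][j] (i != j) is the number of transactions holding both items.
    """
    items = sorted({item for t in transactions for item in t})
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for t in transactions:
        ids = sorted(index[item] for item in set(t))
        for i in ids:
            matrix[i][i] += 1
        for i, j in combinations(ids, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1
    return items, matrix


def frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for all frequent itemsets.

    Hypothetical helper: 1- and 2-itemsets are read from the matrix,
    larger ones come from the usual Apriori join and prune steps.
    """
    items, m = build_adjacency(transactions)
    n = len(items)
    tx_sets = [frozenset(t) for t in transactions]
    result = {}

    # Frequent 1-itemsets: the diagonal of the matrix.
    for i in range(n):
        if m[i][i] >= min_count:
            result[frozenset([items[i]])] = m[i][i]

    # Frequent 2-itemsets: the upper triangle of the matrix.
    level = {}
    for i, j in combinations(range(n), 2):
        if m[i][j] >= min_count:
            level[frozenset([items[i], items[j]])] = m[i][j]

    k = 2
    while level:
        result.update(level)
        k += 1
        # Join: unite pairs of frequent (k-1)-itemsets that give a k-itemset.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Count support of the surviving candidates against the database.
        level = {}
        for c in candidates:
            count = sum(1 for t in tx_sets if c <= t)
            if count >= min_count:
                level[c] = count
    return result


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    demo = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, count in sorted(
        frequent_itemsets(demo, 3).items(), key=lambda kv: sorted(kv[0])
    ):
        print(sorted(itemset), count)
```

Reading the 1- and 2-itemsets directly from the matrix avoids rescanning the database for the two levels that normally produce the most candidates. In this sketch, database scans are only needed from the 3-itemsets onward.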

Keywords: association rules; undirected graph; adjacency matrix; filtering algorithm

Classification number: TP311.13 [Automation and Computer Technology - Computer Software and Theory]

 
