FP-growth算法的优化  被引量:1

Optimization of FP-growth Algorithm

在线阅读下载全文

作  者:闫越[1] 姜昌金[1] 

机构地区:[1]东南大学自动化学院,江苏南京210096

出  处:《信息技术与信息化》2013年第6期125-128,共4页Information Technology and Informatization

摘  要:FP-growth算法是关联规则挖掘中效率较高的算法,以自底向上方式探索树,由FP树产生频繁项集。本文针对FP树构造过程中需多次遍历频繁项列表L的缺点,提出了一种基于散列表的改进算法,实现了项名称关键字到存储地址的映射,进而实现了项名称关键字到其支持度计数的映射。在查找某项的支持度计数时,只需给出其名称关键字,无需从头遍历频繁项列表L,时间复杂度由O(n)提高到O(1)。实验结果表明,改进算法的性能优于原算法,节省了遍历时间,提高了挖掘效率。FP-growth is one of the most efifcient algorithm among all the association rule algorithms.It is a kind of algorithms that explores the FP-tree by a bottom-up way,then it generates frequent items by mining the FP-tree. This article puts forward an optimizing algorithm which is based on hash table against the defects during the process of FP-tree construction,because it usually traverses the frequent item table L time and time again.The new algorithm has achieved a mapping of a key name to the storage address,thus it also achieved a mapping of a key name to its supporting number.As a result, just give an item-key or item-name when you want to search the supporting number of an item. The hash function will help you to calculate the logical address according to the item-key you provided,you will obtain the supporting number directlly according to the logical address.There is no point at all in traversing the frequent item table L.Obviously,time complexity of searching one supporting number of an item improves from O(n) to O(1).At last,experimental results show that optimizing algorithm is indeed better than the original algorithm in terms of the running time.It spends less time than the original one.It saves traversal time and improvs mining efifciency.

关 键 词:数据挖掘 散列表 

分 类 号:TP311.13[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象