An Efficient Mixed-searching-based Algorithm for Mining Top-K Most-frequent Patterns

Cited by: 2


Authors: Ao Fujiang (敖富江)[1], Du Jing (杜静)[2], Chen Bin (陈彬)[1], Huang Kedi (黄柯棣)[1]

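Affiliations: [1] College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha, Hunan 410073; [2] College of Computer, National University of Defense Technology, Changsha, Hunan 410073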

Source: Journal of National University of Defense Technology (《国防科技大学学报》), 2009, No. 2, pp. 90-93 (4 pages)

Funding: National Natural Science Foundation of China (60573057, 60704038)

Abstract: Mining the Top-K most-frequent patterns in a data set is of great significance. Existing Top-K most-frequent pattern mining algorithms usually take the k most frequent items as the initial items and use the support of the least frequent of these items as the initial border support. In practice, however, the number of items that actually make up the Top-K most-frequent patterns may be far smaller than k, which limits the efficiency of these algorithms. To address this, an efficient mixed-searching-based algorithm for mining Top-K most-frequent patterns, MTKFP, is presented. The algorithm first uses breadth-first search to obtain a small number of short itemsets, and uses these short itemsets to determine an initial item range containing fewer than k items together with a higher initial border support; it then uses depth-first search to obtain all Top-K most-frequent patterns. Experimental results show that the number of initial items obtained by MTKFP is at least 70% lower than that of existing algorithms, and that its initial border support is higher than that of existing algorithms; the performance of MTKFP is superior to that of the best existing algorithm.
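The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MTKFP implementation: the keywords indicate that the paper works on an FP-tree, whereas this sketch swaps in plain transaction-id set intersection to keep the example short. The function name `mine_top_k`, the restriction of "short itemsets" to lengths 1 and 2, and the choice to keep every itemset tied at the k-th support are all assumptions made for illustration.

```python
import heapq
from itertools import combinations


def mine_top_k(transactions, k):
    """Return {itemset: support} for all itemsets whose support reaches
    the k-th highest support (ties at the border are kept)."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)

    # Phase 1 (breadth-first): supports of all 1-itemsets, plus 2-itemsets
    # built from the k most frequent items (assumed meaning of "short").
    ranked = sorted(tidsets, key=lambda i: (-len(tidsets[i]), str(i)))
    short = [len(tidsets[i]) for i in ranked]
    for a, b in combinations(ranked[:k], 2):
        short.append(len(tidsets[a] & tidsets[b]))

    # The k-th highest support among the short itemsets is a lower bound
    # on the final border support, so it is a safe initial border.
    top = heapq.nlargest(k, short)
    border = top[-1] if len(top) == k and top[-1] > 0 else 1
    initial_items = [i for i in ranked if len(tidsets[i]) >= border]

    heap = []   # min-heap holding the k best supports found so far
    found = {}  # every itemset that passed the border when it was visited

    def record(itemset, support):
        nonlocal border
        found[itemset] = support
        if len(heap) < k:
            heapq.heappush(heap, support)
        elif support > heap[0]:
            heapq.heapreplace(heap, support)
        if len(heap) == k:
            border = max(border, heap[0])  # the border only ever rises

    # Phase 2 (depth-first): extend each prefix with later items only,
    # pruning any extension whose support falls below the current border.
    def dfs(prefix, prefix_tids, candidates):
        for idx, item in enumerate(candidates):
            tids = prefix_tids & tidsets[item] if prefix else tidsets[item]
            if len(tids) < border:
                continue
            itemset = prefix | {item}
            record(itemset, len(tids))
            dfs(itemset, tids, candidates[idx + 1:])

    dfs(frozenset(), set(), initial_items)
    return {s: v for s, v in found.items() if v >= border}


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"], ["b", "c"]]
result = mine_top_k(transactions, k=3)
```

On this toy data the three single items have supports 4, 3 and 3, and every pair has support 2, so the short itemsets already fix the initial border at 3 and the depth-first phase never expands a pair. Because any superset has support no larger than its subsets, skipping an itemset below the border also safely skips all of its extensions.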

Keywords: Top-K most-frequent patterns; border support; mixed searching; FP-tree

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
