An Improved Algorithm for Mining Sequential Pattern Based on Bitmap

Authors: WANG Hongxia[1], HU Xuegang[1]

Affiliation: [1] School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009, China

Source: Computer Technology and Development (《计算机技术与发展》), 2007, No. 8, pp. 84-87, 91 (5 pages)

Abstract: Building on BBSP (bitmap-based sequential patterns), this paper proposes a new algorithm called Last Position Induction Sequential Pattern Mining (LPI-SPM), which can efficiently find all frequent sequential patterns in a large database. The main difference from previous work lies in how a sequence is judged to be a pattern: earlier methods either build an S-matrix by scanning the projected database (PrefixSpan), or count support by joining (SPADE) or ANDing (BBSP) with the candidate item. In contrast, LPI-SPM performs this check easily using the following fact: if an item's last position is smaller than the current prefix position, the item cannot appear after the current prefix in the same customer sequence. LPI-SPM can greatly reduce the search space during the mining process and mines sequential patterns effectively. Experimental results show that LPI-SPM outperforms BBSP by up to three times on a variety of datasets.
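The last-position test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each customer sequence is a plain list of single items (no itemsets), it uses dictionaries rather than bitmaps, and the names `last_positions`, `mine`, and `extend` are hypothetical.

```python
from collections import defaultdict


def last_positions(sequence):
    """Map each item to the index of its last occurrence in one sequence."""
    return {item: idx for idx, item in enumerate(sequence)}


def mine(database, min_support):
    """Return {pattern: support} for every frequent sequential pattern.

    Assumption: each sequence is a list of single items.
    """
    last_pos = [last_positions(seq) for seq in database]
    patterns = {}

    def extend(prefix, ends):
        # ends maps a sequence id to the index where the leftmost
        # match of `prefix` ends in that sequence (-1 for the empty prefix).
        counts = defaultdict(int)
        for sid, end in ends.items():
            for item, last in last_pos[sid].items():
                # Last-position test: the item can follow the prefix
                # only if its last occurrence lies after the prefix end.
                if last > end:
                    counts[item] += 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            new_ends = {}
            for sid, end in ends.items():
                if last_pos[sid].get(item, -1) > end:
                    # First occurrence of the item after the prefix end.
                    new_ends[sid] = database[sid].index(item, end + 1)
            pattern = prefix + (item,)
            patterns[pattern] = support
            extend(pattern, new_ends)

    extend((), {sid: -1 for sid in range(len(database))})
    return patterns


database = [list("abcb"), list("acb"), list("ba")]
result = mine(database, min_support=2)
```

In this sketch, deciding whether an item can extend a prefix is a single integer comparison per customer sequence, so no projected database is scanned and no candidate bitmaps are combined just to count support.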

Keywords: KDD; bitmap; sequential pattern

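Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP301.6 [Automation and Computer Technology: Control Science and Engineering]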

 
