Frequent pattern mining over data streams by multi-thread techniques (基于多线程技术的数据流频繁模式挖掘)

Authors: ZHOU Xinghua (周兴华)[1], LU Jianfeng (陆建峰)[1], TANG Jiubin (汤九斌)[2]

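Affiliations: [1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094; [2] China Telecom Jiangsu Company, Nanjing 210037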

Source: Journal of Computer Applications (《计算机应用》), 2013, Issue A01, pp. 69-72 (4 pages)

Funding: Natural Science Foundation of Jiangsu Province (BK2009489); Jiangsu Qing Lan Project

Abstract: The Mining Frequent Itemsets within a Transaction-sensitive Sliding Window (MFI-TransSW) algorithm implements the sliding window operation with bit sequences, which markedly reduces memory consumption and mining time. However, it inherits the limitation of the Apriori family of algorithms, which generate huge candidate sets, so it runs inefficiently in the frequent pattern generation phase. To address this shortcoming, a multi-threaded algorithm named Mining Frequent Itemsets within a Multithreaded Sliding Window (MFI-MultiSW) is proposed, in which the window is divided into a fixed number of segments (batches of transactions). MFI-MultiSW stores the current candidate itemsets and the transactions in the current window in a linear linked list, and generates frequent patterns from that list with multiple threads. Experimental results show that, compared with the original algorithm, the improved algorithm can raise execution efficiency by a multiple in a multi-core processor environment.
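
The ideas named in the abstract (bit sequences per item, a window split into a fixed number of segments, and multi-threaded Apriori-style pattern generation) can be illustrated with a minimal sketch. This is not the authors' MFI-MultiSW implementation: the function names, the use of Python integers as bit sequences, the per-segment counting scheme, and the thread pool are all assumptions made for illustration, and the linear linked list from the paper is replaced by plain Python lists and dicts.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def build_bit_sequences(transactions):
    """Map each item to an int; bit i is set when transaction i contains the item."""
    bits = {}
    for i, txn in enumerate(transactions):
        for item in txn:
            bits[item] = bits.get(item, 0) | (1 << i)
    return bits


def segment_support(bits, itemset, n_txns):
    """Count the transactions of one segment that contain every item of itemset."""
    acc = (1 << n_txns) - 1  # all-ones mask covering the segment
    for item in itemset:
        acc &= bits.get(item, 0)  # bitwise AND keeps transactions with all items
    return bin(acc).count("1")


def mine_frequent_itemsets(window, n_segments, min_support):
    """Level-wise mining; each segment is counted by a worker thread (hypothetical scheme)."""
    size = max(1, -(-len(window) // n_segments))  # ceiling division, at least 1
    segments = [window[i:i + size] for i in range(0, len(window), size)]
    seg_bits = [(build_bit_sequences(seg), len(seg)) for seg in segments]

    items = sorted({item for txn in window for item in txn})
    candidates = [frozenset([item]) for item in items]
    frequent = {}

    with ThreadPoolExecutor(max_workers=n_segments) as pool:
        while candidates:
            def count_segment(seg, cands=candidates):
                bits, n = seg
                return [segment_support(bits, c, n) for c in cands]

            # One partial count list per segment, summed per candidate.
            partial = list(pool.map(count_segment, seg_bits))
            totals = [sum(col) for col in zip(*partial)]
            level = {c: s for c, s in zip(candidates, totals) if s >= min_support}
            frequent.update(level)

            # Join frequent k-itemsets into (k+1)-candidates, then prune any
            # candidate that has an infrequent k-subset (Apriori property).
            k = len(candidates[0]) + 1
            joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
            candidates = [c for c in joined
                          if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent
```

For example, `mine_frequent_itemsets([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}], n_segments=2, min_support=2)` returns a dict from each frequent itemset (as a `frozenset`) to its support in the window. In standard CPython builds the global interpreter lock prevents these threads from counting in parallel, so the sketch shows the partitioning only; it should not be expected to reproduce the speedup reported in the abstract.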

Keywords: data stream; frequent pattern; sliding window; linear linked list; multi-threading

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
