Mining Non-Redundant Association Rules Based on Closed Itemsets (一种基于闭项集的无冗余关联规则挖掘方法)  Cited by: 2

Authors: Song Wei [1], Gao Lei [1], Li Jinhong [1]

Affiliation: [1] College of Information Engineering, North China University of Technology, Beijing 100144, China

Source: Journal of Beijing Jiaotong University (《北京交通大学学报》), 2009, No. 6, pp. 91-96 (6 pages)

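Funding: Beijing Municipal Universities Talent Development Program; Key Research Fund for Young Scholars of North China University of Technology; Doctoral Research Start-up Fund of North China University of Technology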

Abstract: Association rule mining often produces tens of thousands of rules, which makes the results hard to understand and apply. To address this problem, an algorithm for mining non-redundant association rules based on closed itemsets is proposed. First, non-redundant association rules are defined, and the definition is justified using the conviction measure of a rule. Second, building on generators, closed itemsets, and non-redundant association rules, the non-redundant min-max precise rule basis and the non-redundant min-max approximate rule basis are defined, and their pruning strategies are discussed. Finally, the properties of generators and the strategies for joining them are presented, and a breadth-first algorithm for mining non-redundant association rules is given, built on a subsume index. Experimental results show that the algorithm discovers a smaller set of non-redundant rules, which improves the understandability of the mining results, and that it is also efficient.
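To make the terms in the abstract concrete, the following is a minimal sketch of how closed itemsets and generators give rise to min-max precise (confidence 1) rules of the form generator → closure minus generator. It relies on the commonly used textbook definitions, which may differ in detail from the paper's own. It also uses brute-force enumeration, not the breadth-first, subsume-index-based algorithm the paper proposes. The transaction data, the support threshold, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy transaction database (not taken from the paper).
TRANSACTIONS = [
    frozenset("acd"),
    frozenset("bce"),
    frozenset("abce"),
    frozenset("be"),
    frozenset("abce"),
]
MIN_SUPPORT_COUNT = 2  # assumed absolute support threshold


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset.intersection(*covering)


def frequent_itemsets(transactions, min_count):
    """Brute-force enumeration of all frequent itemsets with their supports."""
    items = sorted(frozenset().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_count:
                result[candidate] = count
    return result


def is_generator(itemset, frequent, n_transactions):
    """A generator has strictly lower support than each immediate subset."""
    for item in itemset:
        subset = itemset - {item}
        subset_support = frequent[subset] if subset else n_transactions
        if subset_support == frequent[itemset]:
            return False
    return True


def exact_rules(transactions, min_count):
    """Rules generator -> (closure - generator); each holds with confidence 1."""
    frequent = frequent_itemsets(transactions, min_count)
    rules = []
    for itemset in frequent:
        if is_generator(itemset, frequent, len(transactions)):
            consequent = closure(itemset, transactions) - itemset
            if consequent:
                rules.append((itemset, consequent))
    return rules


if __name__ == "__main__":
    for lhs, rhs in exact_rules(TRANSACTIONS, MIN_SUPPORT_COUNT):
        print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

On this toy data the sketch yields seven rules, for example a → c and ab → ce. Each rule pairs a minimal antecedent (a generator) with the largest consequent available from its closure, which is the intuition behind the "min-max" naming.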

Keywords: data mining; non-redundant association rule; generator; closed itemset; subsume index

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
