Source: Journal of Beijing Jiaotong University (《北京交通大学学报》), 2009, No. 6, pp. 91-96 (6 pages)
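Funding: Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality; Key Research Fund for Young Scholars of North China University of Technology; Doctoral Research Start-up Fund of North China University of Technology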
Abstract: Association rule mining often produces tens of thousands of rules, which makes the results hard to understand and apply. To address this problem, an algorithm for mining non-redundant association rules based on closed itemsets is proposed. First, non-redundant association rules are defined, and the rationality of the definition is justified using the conviction of a rule. Second, building on generators, closed itemsets, and non-redundant association rules, the non-redundant min-max precise rule basis and the non-redundant min-max approximate rule basis are defined, and their pruning strategies are discussed. Finally, the properties and joining strategies of generators are discussed, and a breadth-first algorithm for mining non-redundant association rules is given on the basis of a subsume index. Experimental results show that the proposed algorithm discovers a smaller set of non-redundant rules, which improves the understandability of the mining results, and that it is also efficient.
Keywords: data mining; non-redundant association rules; generator; closed itemset; subsume index
Classification number: TP311 [Automation and Computer Technology - Computer Software and Theory]
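
The abstract builds on three standard notions: the closure of an itemset, generators (minimal itemsets with a given closure), and min-max exact rules of the form generator → closure \ generator. The following is a minimal brute-force sketch of those notions in Python, not the paper's algorithm: it omits the paper's additional non-redundancy pruning, the generator joining strategy, and the subsume index. The toy transactions and all function names are assumptions made for illustration. Conviction is computed with the commonly used definition conv(X → Y) = (1 - supp(Y)) / (1 - conf(X → Y)), which the record itself does not spell out.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, not taken from the paper).
TRANSACTIONS = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def support_count(itemset, transactions=TRANSACTIONS):
    """Number of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def closure(itemset, transactions=TRANSACTIONS):
    """Intersection of all transactions that contain `itemset`."""
    items = frozenset(itemset)
    covering = [t for t in transactions if items <= t]
    return frozenset(set.intersection(*covering)) if covering else items


def is_generator(itemset, transactions=TRANSACTIONS):
    """True if dropping any single item strictly increases the support."""
    items = frozenset(itemset)
    count = support_count(items, transactions)
    return all(support_count(items - {x}, transactions) > count for x in items)


def frequent_generators(min_count, transactions=TRANSACTIONS):
    """Brute-force enumeration of non-empty frequent generators."""
    universe = sorted(set().union(*transactions))
    found = []
    for size in range(1, len(universe) + 1):
        for combo in combinations(universe, size):
            if (support_count(combo, transactions) >= min_count
                    and is_generator(combo, transactions)):
                found.append(frozenset(combo))
    return found


def min_max_exact_rules(min_count, transactions=TRANSACTIONS):
    """Rules g -> closure(g) - g for each frequent generator g (confidence 1)."""
    rules = []
    for g in frequent_generators(min_count, transactions):
        consequent = closure(g, transactions) - g
        if consequent:  # skip generators that are already closed
            rules.append((g, consequent))
    return rules


def conviction(antecedent, consequent, transactions=TRANSACTIONS):
    """Conviction of antecedent -> consequent; infinite for exact rules.

    Assumes the antecedent occurs in at least one transaction.
    """
    n = len(transactions)
    both = support_count(set(antecedent) | set(consequent), transactions)
    conf = both / support_count(antecedent, transactions)
    if conf == 1.0:
        return float("inf")
    return (1 - support_count(consequent, transactions) / n) / (1 - conf)


if __name__ == "__main__":
    # Minimum support of 2 transactions out of 5.
    for lhs, rhs in min_max_exact_rules(min_count=2):
        print(sorted(lhs), "->", sorted(rhs))
```

Each rule produced this way has a minimal antecedent and a maximal consequent for its closed itemset, which is why a basis built from such rules is much smaller than the full set of confidence-1 rules. The exhaustive enumeration is exponential in the number of items and is only meant to make the definitions concrete.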