An Association Rule Mining Reduction Algorithm Based on Determining Prime Attributes  (Cited by: 7)


Authors: XIONG Zhong-min[1], WANG Bo, TAO Ran, ZHENG Zong-sheng[1], CHEN Ming[1,2]

Affiliations: [1] College of Information Technology, Shanghai Ocean University, Shanghai 201306, China; [2] Key Laboratory of Fisheries Information, Ministry of Agriculture, Shanghai 201306, China

Source: Computer Engineering & Science, 2021, No. 4, pp. 738-745 (8 pages)

Funding: National Natural Science Foundation of China (61702325); Shanghai Science and Technology Innovation Action Plan (16391902902).

Abstract: Association rule mining is a classic data mining method, and a growing number of enterprises regard it as an indispensable tool for strategic analysis. Current association rule mining methods produce too many rules, which makes the results hard for users to interpret, so methods for reducing association rule sets have practical value. This paper studies how the prime attributes contained in the keys of a database schema affect the association rules produced by Apriori-based mining. Partial functional dependencies cause redundant information to appear frequently in the data set being mined and give rise to association rules of no practical value; identifying and eliminating such rules reduces the rule set. Finding all prime attributes, like finding all candidate keys, is NP-hard. The paper therefore proposes an algorithm that determines prime attributes by verification against a single candidate key, and on that basis designs and implements an association rule mining reduction algorithm based on prime attribute determination. Experiments verify the effectiveness of the algorithm.
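The abstract does not give the algorithm itself, so the following is only a minimal sketch of the underlying database notions, not the authors' method. It assumes functional dependencies are supplied as pairs of attribute sets, and it uses a toy schema (`student`, `course`, `name`, `grade`) invented for illustration. The helper names `closure`, `one_candidate_key`, and `is_partial_dependency_rule` are hypothetical. The sketch computes an attribute closure, shrinks the schema to one candidate key in polynomial time, and flags a mined rule whose antecedent is a proper subset of that key and already functionally determines the consequent, which is one plausible reading of a rule that merely restates a partial dependency.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is covered and rhs adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def one_candidate_key(schema, fds):
    """Shrink the full attribute set to a single candidate key."""
    key = set(schema)
    for attr in sorted(schema):
        # Drop an attribute if the remaining ones still determine the schema.
        if closure(key - {attr}, fds) >= schema:
            key.discard(attr)
    return frozenset(key)


def is_partial_dependency_rule(antecedent, consequent, key, fds):
    """True if antecedent is a proper subset of `key` and determines consequent."""
    x, y = frozenset(antecedent), frozenset(consequent)
    return x < key and y <= closure(x, fds)


# Toy schema and dependencies (assumed for illustration only).
schema = frozenset({"student", "course", "name", "grade"})
fds = [
    (frozenset({"student"}), frozenset({"name"})),
    (frozenset({"student", "course"}), frozenset({"grade"})),
]

key = one_candidate_key(schema, fds)
redundant = is_partial_dependency_rule({"student"}, {"name"}, key, fds)
informative = is_partial_dependency_rule({"course"}, {"grade"}, key, fds)
```

In this toy case the key is `{student, course}`, and every attribute of that key is prime. The rule `student -> name` is flagged because it follows from the schema alone, while `course -> grade` is not. Deciding whether an arbitrary attribute is prime requires reasoning about all candidate keys, which is the hard part the paper addresses and which this sketch does not attempt.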

Keywords: data mining; association rules; prime attributes; association relationship; algorithm optimization

Classification: TP311.131 [Automation and Computer Technology: Computer Software and Theory]

 
