基于马尔可夫优化的高效用项集挖掘算法  

High utility itemset mining algorithm based on Markov optimization

在线阅读下载全文

作  者:钟新成 刘昶[2] 赵秀梅[1] ZHONG Xincheng;LIU Chang;ZHAO Xiumei(Department of Computer Science,Changzhi University,Changzhi Shanxi 046000,China;Shenyang Institute of Automation,Chinese Academy of Sciences,Shenyang Liaoning 110169,China)

机构地区:[1]长治学院计算机系,山西长治046000 [2]中国科学院沈阳自动化研究所,沈阳110169

出  处:《计算机应用》2023年第12期3764-3771,共8页journal of Computer Applications

基  金:2022年度山西省高等学校科技创新项目(2022L517)。

摘  要:基于树型和链表结构的高效用项集挖掘(HUIM)算法通常需要指数量级的搜索空间,而基于进化类型的挖掘算法未能充分考虑变量间的相互作用,因此提出一种基于马尔可夫优化的HUIM算法(HUIM-MOA)。首先,采用位图矩阵表示数据库和使用期望向量编码,以实现对数据库的快速扫描和效用值的高效计算;其次,通过计算优势个体间的互信息估计马尔可夫网络(MN)结构,并根据它们的局部特性使用吉布斯采样以产生新的种群;最后,为防止算法过快陷入局部最优和减少高效用项集的缺失,分别采用种群多样性保持策略和精英策略。在真实数据集上的实验结果表明,相较于次优的基于粒子群优化(PSO)的生物启发式HUI框架(Bio-HUIF-PSO)算法,在给定较大最小阈值的情况下,HUIM-MOA可以找到全部的高效用项集(HUI),收敛速度平均提升12.5%,挖掘HUI数平均提高2.85个百分点,运行时间平均减少14.6%。HUIM-MOA较进化型HUIM算法有更强的搜索性能,能有效减少搜索时间和提高搜索质量。To address the problems that the High Utility Itemset Mining(HUIM)algorithms based on tree and link table structures often consume search spaces of orders of magnitude,and the evolutionary type-based mining algorithms fail to fully consider the interactions between variables,an HUIM algorithm based on Markov Optimization(HUIM-MOA)was proposed.Firstly,a bitmap matrix for expressing database and expectation vector encoding were used to achieve fast scanning of the database and efficient computation of utility values,respectively.Then,the Markov Network(MN)structure was estimated by computing the mutual information among dominant individuals and new populations were generated by using Gibbs sampling according to their local characteristics.Finally,population diversity preservation strategy and elite strategy were used to prevent the algorithm from falling into local optimum too quickly and to reduce the missing of high utility itemsets,respectively.Experimental results on real datasets show that compared with Bio-inspired High Utility Itemset Framework based on Particle Swarm Optimization(Bio-HUIF-PSO)algorithm,HUIM-MOA can find all the High Utility Itemsets(HUIs)when given a larger minimum threshold,with on average 12.5%improvement in convergence speed,2.85 percentage point improvement in mined HUI number,and 14.6%reduction in running time.It can be seen that HUIM-MOA has stronger search performance than the evolutionary HUIM algorithm,which can effectively reduce the search time and improve the search quality.

关 键 词:高效用项集挖掘 马尔可夫网络 位图矩阵 吉布斯采样 精英策略 

分 类 号:TP301.6[自动化与计算机技术—计算机系统结构]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象