Parallel mining on label-constraint proximity pattern (标签集约束近似频繁模式的并行挖掘)  Cited by: 7


Authors: Zheng Haiyan (郑海雁)[1,2], Wang Yuanfang (王远方)[2], Xiong Zheng (熊政)[1], Li Kunming (李昆明)[1], Chong Zhihong (崇志宏)[2], Yin Fei (尹飞)[1]

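Affiliations: [1] Smart Grid Product Center, Jiangsu Fangtian Power Technology Co., Ltd., Nanjing 211189; [2] School of Computer Science and Engineering, Southeast University, Nanjing 211189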

Source: Computer Engineering and Applications (《计算机工程与应用》), 2015, No. 9, pp. 135-141 (7 pages)

Funding: National Natural Science Foundation of China (No. 60973023)

Abstract: The proximity pattern is derived from the frequent pattern and combines characteristics of frequent itemsets and frequent subgraphs. Existing research on proximity patterns concentrates on unlabeled graphs, with applications mainly in social networks, semantic networks, and smart grids. Because proximity pattern mining involves both frequent itemset mining and frequent subgraph mining, existing frequent pattern mining algorithms do not solve the problem well. Building on the structure of the proximity pattern, this paper extends it to labeled graphs, introduces a label-set constraint, and designs LCPP (Label-Constraint Proximity Pattern), an algorithm for mining label-set-constrained proximity patterns. The algorithm is deployed in parallel on the MapReduce computing model and compensates for the low efficiency of the open-source pFP algorithm on large-scale data. Experimental results verify the effectiveness and scalability of the algorithm and indicate that LCPP is an excellent complement to pFP.

Keywords: proximity pattern; label-set constraint; parallelization

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
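
Illustrative sketch: the abstract names the ingredients (label sets on a labeled graph, a label-set constraint, and a MapReduce-style deployment) but does not give the LCPP algorithm itself. The Python below is therefore only a simplified illustration under stated assumptions, not the authors' method: it treats the labels within one hop of each vertex as a "transaction", keeps only labels in an allowed set, and counts co-occurring label sets with a map step and a reduce step. The function names (`map_vertex`, `reduce_counts`, `mine_label_sets`), the one-hop neighborhood, and the `max_size` cap are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def map_vertex(vertex, adjacency, labels, allowed_labels, max_size=2):
    """Map step (assumed design): emit (label_set, 1) pairs for one vertex."""
    # Labels of the vertex and its direct neighbors.
    nearby = {labels[vertex]} | {labels[n] for n in adjacency[vertex]}
    # Apply the label-set constraint: drop labels outside the allowed set.
    nearby &= allowed_labels
    # Emit every label subset of size 2..max_size found in this neighborhood.
    for size in range(2, max_size + 1):
        for combo in combinations(sorted(nearby), size):
            yield combo, 1


def reduce_counts(pairs, min_support):
    """Reduce step: sum counts per label set and keep the frequent ones."""
    counts = Counter()
    for label_set, value in pairs:
        counts[label_set] += value
    return {s: c for s, c in counts.items() if c >= min_support}


def mine_label_sets(adjacency, labels, allowed_labels, min_support, max_size=2):
    """Run the map step over all vertices, then reduce (single process)."""
    pairs = (
        pair
        for vertex in adjacency
        for pair in map_vertex(vertex, adjacency, labels, allowed_labels, max_size)
    )
    return reduce_counts(pairs, min_support)
```

In a real MapReduce deployment the map calls would run on separate workers and the framework would group pairs by key before the reduce; here both steps run in one process so the data flow is easy to follow.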

 
