频繁量化模式图挖掘及应用  

Mining and Application of Frequent Patterns with Counting Quantifiers

在线阅读下载全文

作  者:沙雨济 王欣 何艳潇 钟学燕 方宇[1] SHA Yuji;WANG Xin;HE Yanxiao;ZHONG Xueyan;FANG Yu(School of Computer Science,Southwest Petroleum University,Chengdu 610500,China)

机构地区:[1]西南石油大学计算机科学学院,成都610500

出  处:《计算机科学》2023年第S02期565-576,共12页Computer Science

摘  要:频繁模式挖掘(FPM)是图数据研究领域的一个经典问题,单一大图上的FPM问题近年来受到了更加广泛的关注。该问题被定义为根据用户给定的频率阈值查找在大图(Graph)中频繁出现的所有模式图(Pattern)。近年来,人们见证了FPM在多个领域的广泛应用,例如社交网络分析、欺诈检测等。然而,面对新兴的应用需求,人们需要更具语义表达力的模式图及其挖掘技术。为此,在传统模式图的基础上,首先提出了量化模式图(Quantified Graph Patterns,QGPs)——一类具有计数量词约束的模式图,实现了模式图语义的扩展;其次设计了一种在分布式场景下挖掘QGPs的算法,提出了量化图模式关联规则(Quantified Graph Pattern Association Rules,QGPARs)及其挖掘技术,用于预测(社交)网络中实体之间的潜在联系,然后利用真实图和合成图数据,通过翔实的实验验证了QGPs挖掘算法的计算效率,通过与经典链接预测方法进行对比,发现QGPARs可以取得更高的链接预测准确性;最后通过与传统图模式关联规则(Graph Pattern Association Rules,GPARs)的链接预测结果进行对比,验证了QGPARs与GPARs之间在链接预测结果方面存在显著差异,也进一步验证了QGPARs在链接预测中的有效性。Frequent pattern mining(FPM)is a classical problem in graph theory,more attention has been paid on FPM on single large graphs,which is defined as discovering all the pattern graphs Q with occurrence frequencies above a user defined threshold,in a single large graph G.In recent years,people have witnessed wide applications of FPM,such as social network analysis and fraud detection.However,emerging applications keep calling for more expressive pattern graphs along with their mining techniques to capture more complex structures in a large graph.In light of this,we incorporate counting quantifiers in pattern graphs and introduce quantified pattern graphs(QGPs)which are able to express richer semantics.We then develop a distributive algorithm to mine QGPs in parallel.Furthermore,we introduce quantified graph pattern association rules(QGPARs)for linking prediction on large graphs.We conduct experimental studies to validate the computational efficiency of the QGPs mining algorithm by using real-world and synthetic graph data.By comparing with prior link prediction methods,we find that prediction with QGPARs achieves even higher accuracy.Finally,by comparing with the link prediction results of traditional graph pattern association rules,we verify that there is a significant difference between QGPARs and GPARs in terms of link prediction results,and further verify the effectiveness of QGPARs in link prediction.

关 键 词:量化模式图 频繁模式挖掘 分布式挖掘 量化图模式关联规则 链接预测 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象