检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:石俊豪 王欣 邹杰军 方宇 蒋星 Shi Junhao;Wang Xin;Zou Jiejun;Fang Yu;Jiang Xing(School of Computer Science and Software Engineering,Southwest Petroleum University,Chengdu,610500,China)
机构地区:[1]西南石油大学计算机与软件学院,成都610500
出 处:《南京大学学报(自然科学版)》2024年第4期552-565,共14页Journal of Nanjing University(Natural Science)
基 金:国家自然科学基金(62172102);四川省科技创新人才基金(2022JDRC0009)。
摘 要:图采样通过对图数据进行约简操作,获得比原图的规模更小的图结构,进而服务于图谱分析、图可视化等下游任务.现有的图采样算法侧重于保留图中显著的结构特征而忽略了节点属性,导致采样图在许多下游任务如频繁模式挖掘等,难以取得预期效果.为此,提出基于Motif的节点有偏采样算法(Motif-Based Node Biased Sampling,MNBS),利用频繁Motif结构重新定义图中节点的重要性,随后进行有偏节点采样,实现融合节点属性与结构特征的采样.为了快速识别频繁Motif模式,设计了具有“提前终止”特性的Motif模式快速发现算法(Fast Motif-Pattern Discovery,FMPD),能高效且准确地发现Motif模式以支持图采样.实验表明,MNBS采样算法在多项指标上优于其他基线算法,其对数归一化累积组相关性指标平均降低0.54,使用包含“提前终止”特性策略的FMPD算法的时间消耗和内存消耗比基线算法分别降低56.1%和29.8%.Graph sampling obtains graph structures of smaller size compared to the original graph by performing approximation operations on graph data,and thus serves downstream tasks such as graph analysis and graph visualisation.Existing graph sampling algorithms focus on preserving salient structural features in the graph and ignore node attributes,leading to difficulties in achieving the expected results of sampled graphs in many downstream tasks,such as frequent pattern mining.For this reason,this paper proposes Motif⁃Based Node Biased Sampling(MNBS),an algorithm that redefines the importance of nodes in the graph using frequent Motif substructures,followed by biased node sampling,achieving sampling that fuses node attributes with structural features.In order to quickly identify frequent Motif patterns,Fast Motif⁃Pattern Discovery(FMPD)algorithm with"early termination"is designed to efficiently and accurately discover Motif patterns to support graph sampling.Experiments show that the MNBS sampling algorithm outperforms the other baseline algorithms in a number of metrics.For example,the average reduction of Logarithm Normalized Cumulative Group Relevance is 0.54,and FMPD algorithm with the"early termination"feature reduces the time and memory consumption by 56.1%and 29.8%,respectively.
分 类 号:TP301[自动化与计算机技术—计算机系统结构]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.7