Identification of Key Classes in Software Systems Based on Graph Neural Networks (基于图神经网络的软件系统中关键类的识别)  Cited by: 2


Authors: 张健雄 (ZHANG Jian-xiong); 宋坤 (SONG Kun); 何鹏 (HE Peng); 李兵 (LI Bing) [3] (School of Computer and Information Engineering, Hubei University, Wuhan 430062, China; Hubei Provincial Key Laboratory of Applied Mathematics, Wuhan 430062, China; School of Computer Science, Wuhan University, Wuhan 430072, China)

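Affiliations: [1] School of Computer and Information Engineering, Hubei University, Wuhan 430062, China; [2] Hubei Provincial Key Laboratory of Applied Mathematics, Wuhan 430062, China; [3] School of Computer Science, Wuhan University, Wuhan 430072, China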

Source: Computer Science (《计算机科学》), 2021, No. 12, pp. 149-158 (10 pages)

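Funding: National Key Research and Development Program of China (2018YFB1003801); National Natural Science Foundation of China (61832014, 61902114, 61977021); Major Science and Technology Project of Hubei Province (2019ACA144); Open Fund of the Hubei Provincial Key Laboratory of Applied Mathematics (HBAM201901).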

Abstract: Software systems usually contain a few key classes that occupy core positions in the system's topology. Defects in these classes tend to pose serious security risks to the whole system, so identifying them is essential for engineers who need to understand or maintain an unfamiliar software system. To address this problem, the paper proposes a method for identifying key classes based on graph neural networks. First, the software system is abstracted as a software network using complex network theory. Second, unsupervised network node embedding learning is combined with neighborhood aggregation to build an encoder-decoder framework that extracts a representation vector for each class node in the software system. Finally, based on the obtained node representations, a pairwise learning-to-rank algorithm ranks the nodes of the network by importance, which identifies the key classes. To verify the effectiveness of the method, four open-source object-oriented Java software systems are analyzed empirically, and the method is compared with five commonly used node importance measures and two existing works. The experimental results show that, compared with betweenness centrality, K-core, closeness centrality, the node contraction method and PageRank, the proposed method identifies key classes more effectively from the perspective of network robustness. In addition, on the existing public labeled dataset, the recall and precision of the proposed method within the top 15% of nodes both improve on the existing works by more than 10%.
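The final ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the loss function or the scoring model, so the sketch assumes a linear scoring function trained with a RankNet-style logistic pairwise loss. The class names (`Core`, `Util`, `Helper`, `Dto`), the two-dimensional vectors and the training pairs are hypothetical stand-ins for the node representations that the encoder-decoder framework would produce.

```python
import math
import random


def train_pairwise_ranker(embeddings, pairs, epochs=200, lr=0.1, seed=0):
    """Fit linear weights w so that score(a) > score(b) for every pair (a, b).

    embeddings: dict mapping node name -> list of floats (hypothetical
                stand-in for learned node representations).
    pairs:      list of (more_important, less_important) node names.
    Loss per pair (an assumption, not taken from the paper):
        log(1 + exp(-(score(a) - score(b))))
    """
    rng = random.Random(seed)
    dim = len(next(iter(embeddings.values())))
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for more, less in pairs:
            diff = [a - b for a, b in zip(embeddings[more], embeddings[less])]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            # Gradient descent step: d(loss)/dw = -sigmoid(-margin) * diff
            step = 1.0 / (1.0 + math.exp(margin))
            w = [wk + lr * step * dk for wk, dk in zip(w, diff)]
    return w


def rank_nodes(embeddings, w):
    """Return node names sorted from highest to lowest predicted importance."""
    scores = {
        name: sum(wk * xk for wk, xk in zip(w, vec))
        for name, vec in embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: four classes with 2-dimensional representations.
toy_embeddings = {
    "Core": [0.9, 0.8],
    "Util": [0.5, 0.4],
    "Helper": [0.1, 0.2],
    "Dto": [0.0, 0.1],  # not part of any training pair
}
toy_pairs = [("Core", "Util"), ("Util", "Helper")]

weights = train_pairwise_ranker(toy_embeddings, toy_pairs)
ranking = rank_nodes(toy_embeddings, weights)
print(ranking)
```

Because the model learns a score per node rather than a label per pair, it can also place nodes that never appeared in a training pair (here `Dto`). Taking the first 15% of `ranking` would correspond to the top-15% cut-off mentioned in the abstract.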

Keywords: software network; key class identification; network embedding; graph neural network; learning to rank

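Classification codes: TP311.53 [Automation and Computer Technology: Computer Software and Theory]; TP183 [Automation and Computer Technology: Computer Science and Technology]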

 
