Cross-project Software Defect Prediction Based on Federated Transfer  (Cited by: 1)

Authors: Song Huiling; Li Yong [1,2,3]; Zhang Wenjing

Affiliations: [1] College of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China; [2] Software Development Department, Xinjiang Electronic Research Institute, Urumqi 830010, China; [3] Key Laboratory of Ministry of Industry and Information Technology for Safety-critical Software Development and Verification, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

Source: Journal of Nanjing Normal University (Natural Science Edition), 2024, No. 3, pp. 122-128 (7 pages)

Funding: Tianshan Youth Program of Xinjiang Uygur Autonomous Region (2020Q019); Xinjiang Normal University Doctoral Research Start-up Fund (XJNUBS1905).

Abstract: Cross-project software defect prediction builds models from labeled data drawn from multiple source projects, which addresses the problems of insufficient historical software data and high labeling cost. In traditional cross-project defect prediction, however, source-project data holders protect the commercial privacy of their software data, and the resulting "data island" problem directly degrades the performance of cross-project prediction models. This paper proposes a cross-project software defect prediction method based on federated transfer (FT-CPDP). First, to address data privacy leakage and feature heterogeneity between projects, a model algorithm combining federated learning and transfer learning is proposed; it breaks down the "data barriers" among data holders and enables cross-project defect prediction in a privacy-preserving setting. Second, noise that satisfies the privacy budget is added during federated communication to raise the level of privacy protection. Finally, a convolutional neural network model is built to perform software defect prediction. Experiments on the NASA software defect prediction datasets show that, compared with traditional cross-project defect prediction methods, FT-CPDP achieves better overall performance while preserving the privacy of software data.
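
The abstract does not give the details of the FT-CPDP algorithm, so the following is only a minimal sketch of the general idea behind its second step: each data holder clips its local model update and adds noise calibrated to a privacy budget before sending it, and a server averages the noisy updates. It swaps in the classical Gaussian mechanism and plain weighted federated averaging, which may differ from what the paper uses. All names (`clip`, `gaussian_sigma`, `privatize`, `federated_average`), the parameter values, and the choice to treat the clipping norm as the sensitivity are assumptions made for illustration.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classical Gaussian mechanism (stated for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(update, max_norm, epsilon, delta, rng):
    """Clip a client update, then add Gaussian noise before it leaves the client.

    Assumption: the clipping norm is used as the L2 sensitivity.
    """
    sigma = gaussian_sigma(epsilon, delta, sensitivity=max_norm)
    return [u + rng.gauss(0.0, sigma) for u in clip(update, max_norm)]


def federated_average(updates, weights):
    """Weighted average of client updates (weights are, e.g., local sample counts)."""
    total = sum(weights)
    dim = len(updates[0])
    return [
        sum(w * u[i] for u, w in zip(updates, weights)) / total
        for i in range(dim)
    ]


# Hypothetical round with three source projects; the values are made up.
rng = random.Random(0)
client_updates = [[0.5, -1.0, 2.0], [0.1, 0.3, -0.2], [3.0, 4.0, 0.0]]
client_sizes = [100, 50, 200]

noisy_updates = [
    privatize(u, max_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
    for u in client_updates
]
global_update = federated_average(noisy_updates, client_sizes)
```

In this sketch a smaller `epsilon` (a tighter privacy budget) gives a larger noise scale, which is the usual trade-off between privacy protection and model accuracy in such a setting.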

Keywords: software defect prediction; federated learning; transfer learning; differential privacy; convolutional neural network

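Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]; TP311.5 [Automation and Computer Technology: Control Science and Engineering]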

 
