Test Input Prioritization Approach Based on DNN Model Output Differences


Authors: ZHU Jin, TAO Chuanqi, GUO Hongjing [1]

Affiliations: [1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; [2] Ministry Key Laboratory for Safety-Critical Software Development and Verification, Nanjing 210016, China; [3] State Key Laboratory for Novel Software Technology, Nanjing 210023, China; [4] Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210016, China

Source: Computer Science (《计算机科学》), 2024, Issue S01, pp. 818-825 (8 pages)

Abstract: Deep neural network (DNN) testing requires a large amount of test data to ensure DNN quality. However, most test inputs are unlabeled, and labeling them manually is costly. To reduce this labeling cost, researchers have proposed test input prioritization approaches, which select high-priority test inputs for labeling. Most of these approaches, however, work well only in limited scenarios; for example, they struggle to identify inputs that are misclassified with high confidence. To address this challenge, this paper applies differential testing to test input prioritization and proposes DeepDiff, a test input prioritization method based on DNN model output differences. DeepDiff first constructs a contrast model that has the same functionality as the original model, then computes the difference between the outputs of the original model and the contrast model for each test input, and finally assigns higher priority to test inputs with larger output differences. The empirical study covers four widely used datasets and eight corresponding DNN models. The experimental results show that the effectiveness of DeepDiff is on average 13.06% higher than that of the baseline approaches on the original test sets and 39.69% higher on the mixed test sets.
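The three steps described in the abstract (obtain outputs from both models, measure their difference, rank by that difference) can be illustrated with a minimal sketch. The abstract does not state which difference measure DeepDiff uses, so the L1 distance between the two models' class-probability vectors is an assumption made here for illustration only. The function name `prioritize_by_output_difference` and its parameters are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def prioritize_by_output_difference(probs_original, probs_contrast):
    """Rank test inputs by how much two models disagree on them.

    probs_original, probs_contrast: arrays of shape (n_inputs, n_classes)
    holding each model's class probabilities for the same test inputs.
    Returns (order, diff), where `order` lists input indices from highest
    to lowest priority and `diff` holds the per-input difference score.
    """
    p = np.asarray(probs_original, dtype=float)
    q = np.asarray(probs_contrast, dtype=float)
    if p.shape != q.shape:
        raise ValueError("both models must produce outputs of the same shape")

    # Assumed metric (not from the paper): L1 distance between the
    # probability vectors of the original and the contrast model.
    diff = np.abs(p - q).sum(axis=1)

    # Larger difference means higher priority; a stable sort keeps the
    # original input order among inputs with equal scores.
    order = np.argsort(-diff, kind="stable")
    return order, diff


# Toy outputs for three test inputs and two classes.
original = [[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]]
contrast = [[0.9, 0.1], [0.1, 0.9], [0.25, 0.75]]
order, diff = prioritize_by_output_difference(original, contrast)
print(order.tolist())
```

In this toy example the second input, on which the two models disagree most, is ranked first, and the first input, on which they agree exactly, is ranked last. Under a limited labeling budget, only the inputs at the front of `order` would be sent for labeling.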

Keywords: deep neural network testing; test input prioritization; differential testing; model output differences

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
