DeepRanger: Coverage-Guided Deep Forest Testing Approach  Cited by: 2

Authors: CUI Zhan-Qi, XIE Rui-Lin, CHEN Xiang[2], LIU Xiu-Lei[1], ZHENG Li-Wei[1]

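Affiliations: [1] School of Computer Science, Beijing Information Science and Technology University, Beijing 100101, China; [2] School of Information Science and Technology, Nantong University, Nantong, Jiangsu 226019, China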

Source: Journal of Software (《软件学报》), 2023, No. 5, pp. 2251-2267 (17 pages)

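Funding: Jiangsu Provincial Special Program for Basic Research on Frontier Leading Technologies (BK20202001); National Natural Science Foundation of China (61702041, 61601039); "Qin Xin Talents" Cultivation Program of Beijing Information Science and Technology University (QXTCP C201906, QXTCP B201905).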

Abstract: The structure of deep learning software differs markedly from that of traditional software. As a result, even after extensive testing it remains hard to measure how well the test data covers the software or how adequate the testing is, and many unknown defects may remain during later use. Deep forest is an emerging deep learning model that overcomes several shortcomings of deep neural networks, such as the need for large amounts of training data, high-performance computing platforms, and many hyperparameters. However, no existing work has studied how to test deep forests. Based on the structural characteristics of deep forests, this study proposes a set of test coverage criteria consisting of random forest node coverage (RFNC), random forest leaf coverage (RFLC), cascade forest class coverage (CFCC), and cascade forest output coverage (CFOC). On this basis, DeepRanger, a coverage-guided test data generation method based on a genetic algorithm, is designed to automatically generate test data sets that effectively improve model coverage. To validate the proposed criteria, a set of experiments is carried out on the open-source deep forest project gcForest and the MNIST data set. The results show that all four coverage criteria can effectively evaluate the adequacy of a test data set for a deep forest model. In addition, compared with a genetic algorithm based on random selection, DeepRanger, which is guided by coverage information, achieves higher model coverage.
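The abstract names the coverage criteria without defining them. As a rough illustration only, the sketch below assumes that node coverage and leaf coverage mean "the fraction of tree nodes (or leaves) in a forest reached by at least one test input", and it replaces the paper's genetic algorithm with a much simpler greedy filter that keeps a candidate input only if it raises leaf coverage. The `Node` class, the function names, and the toy forest are all hypothetical and are not taken from DeepRanger or gcForest.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    """One node of a toy decision tree (hypothetical structure, for illustration)."""
    node_id: int
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def path(tree: Node, x: Sequence[float]) -> List[Node]:
    """Return the nodes visited while routing input x from the root to a leaf."""
    node, visited = tree, [tree]
    while not node.is_leaf:
        node = node.left if x[node.feature] <= node.threshold else node.right
        visited.append(node)
    return visited


def all_nodes(tree: Node) -> List[Node]:
    """Collect every node of a tree in pre-order."""
    if tree.is_leaf:
        return [tree]
    return [tree] + all_nodes(tree.left) + all_nodes(tree.right)


def coverage(forest: List[Node], inputs) -> Tuple[float, float]:
    """Assumed definition: (node coverage, leaf coverage) over the whole forest."""
    total_nodes = total_leaves = hit_nodes = hit_leaves = 0
    for tree in forest:
        nodes = all_nodes(tree)
        visited = {n.node_id for x in inputs for n in path(tree, x)}
        total_nodes += len(nodes)
        total_leaves += sum(n.is_leaf for n in nodes)
        hit_nodes += len(visited)
        hit_leaves += sum(n.is_leaf and n.node_id in visited for n in nodes)
    return hit_nodes / total_nodes, hit_leaves / total_leaves


def select_coverage_increasing(forest, seeds, candidates):
    """Greedy stand-in for coverage guidance (NOT the paper's genetic algorithm):
    keep a candidate only if it increases leaf coverage."""
    kept = list(seeds)
    best = coverage(forest, kept)[1]
    for cand in candidates:
        score = coverage(forest, kept + [cand])[1]
        if score > best:
            kept.append(cand)
            best = score
    return kept


# Toy forest: tree_a has 5 nodes (3 leaves), tree_b has 3 nodes (2 leaves).
tree_a = Node(0, feature=0, threshold=0.5,
              left=Node(1),
              right=Node(2, feature=1, threshold=0.5, left=Node(3), right=Node(4)))
tree_b = Node(0, feature=1, threshold=0.3, left=Node(1), right=Node(2))
forest = [tree_a, tree_b]

seed = [(0.2, 0.9)]
print(coverage(forest, seed))  # (0.5, 0.4)
tests = select_coverage_increasing(forest, seed, [(0.1, 0.8), (0.8, 0.1), (0.9, 0.9)])
print(tests, coverage(forest, tests))
```

In this toy run the candidate `(0.1, 0.8)` is discarded because it follows the same paths as the seed, while the other two candidates each reach a previously unvisited leaf and are kept. A genetic algorithm such as the one the abstract describes would presumably use a coverage measure like this as its fitness signal while also mutating and recombining inputs, which this sketch does not attempt.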

Keywords: deep forest; test coverage criteria; multi-grained scanning coverage; cascade forest coverage

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
