基于前馈神经网络的编译器测试用例生成方法  被引量:9

Compiler Fuzzing Test Case Generation with Feed-forward Neural Network

在线阅读下载全文

作  者:徐浩然 王勇军[1] 黄志坚 解培岱[1] 范书珲 XU Hao-Ran;WANG Yong-Jun;HUANG Zhi-Jian;XIE Pei-Dai;FAN Shu-Hui(College of Computer Science and Technology,National University of Defense Technology,Changsha 410073,China;Institute of System Engineering,Academy of Military Sciences,Beijing 100097,China)

机构地区:[1]国防科技大学计算机学院,湖南长沙410073 [2]军事科学院系统工程研究院,北京100097

出  处:《软件学报》2022年第6期1996-2011,共16页Journal of Software

基  金:国家自然科学基金(61472439);国家重点研发计划(2018YFB0204301)。

摘  要:编译器模糊测试,是测试编译器功能性与安全性的常用技术之一.模糊测试器通过产生语法正确的测试用例,对编译器的深层代码展开测试.近来,基于循环神经网络的深度学习模型被引入编译器模糊测试用例生成过程.针对现有方法生成测试用例的语法正确率不足、生成效率低的问题,提出一种基于前馈神经网络的编译器模糊测试用例生成方法,并设计实现了原型工具FAIR.与现有的基于token序列学习的方法不同,FAIR从抽象语法树中提取代码片段,利用基于自注意力的前馈神经网络捕获代码片段之间的语法关联,通过学习程序设计语言的生成式模型,自动生成多样化的测试用例.实验结果表明,FAIR生成测试用例的解析通过率以及生成效率均优于同类型先进方法.该方法显著提升了检测编译器软件缺陷的能力,已成功检测出GCC和LLVM的20处软件缺陷.此外,该方法具有良好的可移植性,简单移植后的FAIR-JS已在JavaScript引擎中检测到两处软件缺陷.Compiler fuzzing is one of the commonly used techniques to test the functionality and safety of compilers. The fuzzer produces grammatically correct test cases to test the deep parts of the compiler. Recently, recurrent neural networks-based deep learning methods have been introduced to the test case generation process. Aiming at the problems of insufficient grammatical accuracy and low generation efficiency when generating test cases, a method for generating compiler fuzzing test cases is proposed based on feed-forward neural networks, and the prototype tool FAIR is designed and implemented. Different from the method based on token sequence learning, FAIR extracts code fragments from the abstract syntax tree, and uses a self-attention-based feed-forward neural network to capture the grammatical associations between code fragments. After learning a generative model of the programming language, fair automatically produce diverse test cases. Experimental results show that FAIR is superior to its competitors in terms of grammatical accuracy and generation efficiency of generating test cases. The proposed method has significantly improved the ability to detect compiler software defects, and has successfully detected 20 software defects in GCC and LLVM. In addition, the method has soundportability. The simple ported FAIR-JS has detected 2 defects in the JavaScript engine.

关 键 词:软件缺陷 编译器模糊测试 深度学习 前馈神经网络 抽象语法树 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象