Source Code Defect Detection Based on Deep Learning (深度学习源代码缺陷检测方法)  Cited by: 8

Authors: WANG Xiao-meng; ZHANG Tao[1]; XIN Wei[1]; HOU Chang-yu (China Information Technology Security Evaluation Center, Beijing 100085, China)

Affiliation: [1] China Information Technology Security Evaluation Center

Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2019, No. 11, pp. 1155-1159 (5 pages)

Funding: National Natural Science Foundation of China (U1636115, U1736209)

Abstract: Traditional source code defect analysis depends on analysts' understanding of security issues and on long-accumulated experience, which leads to high false positive and false negative rates. To address this, a source code defect detection method based on deep learning is proposed. The method takes the abstract syntax tree (AST) and data-flow features of program source code as input and trains a source code defect classifier to perform detection. Its key idea is to apply deep learning algorithms, together with word embedding algorithms from natural language processing, to learn the deep semantic and syntactic features contained in the AST and the data flow. On this basis, a general deep learning framework for source code defect detection is proposed. The method is validated on the public SARD dataset. The experimental results show that it learns semantic and syntactic features hidden in the source code and outperforms existing detection methods in accuracy, recall, false positive rate, and false negative rate.
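The first stage the abstract describes, turning an abstract syntax tree into a token sequence that a word-embedding layer can consume, might look roughly like the sketch below. It is not the authors' implementation. It assumes Python's standard `ast` module as a stand-in parser (the abstract does not name the parser or the languages covered), and the function names `ast_node_sequence`, `build_vocab`, and `encode` are hypothetical. Data-flow features and the classifier itself are omitted.

```python
import ast
from collections import Counter


def ast_node_sequence(source: str) -> list[str]:
    """Return the AST node type names of `source` in depth-first pre-order.

    Assumption: Python's own parser stands in for whatever front end
    the paper uses; only node types are kept, not identifiers or literals.
    """
    tree = ast.parse(source)
    sequence: list[str] = []

    def visit(node: ast.AST) -> None:
        sequence.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return sequence


def build_vocab(sequences: list[list[str]]) -> dict[str, int]:
    """Map each node type to an integer id; id 0 is reserved for unknown tokens."""
    counts = Counter(tok for seq in sequences for tok in seq)
    return {tok: i + 1 for i, (tok, _) in enumerate(counts.most_common())}


def encode(sequence: list[str], vocab: dict[str, int]) -> list[int]:
    """Convert a node-type sequence to ids suitable for an embedding lookup."""
    return [vocab.get(tok, 0) for tok in sequence]


if __name__ == "__main__":
    sample = "def f(buf, n):\n    return buf[n]\n"  # hypothetical sample input
    seq = ast_node_sequence(sample)
    vocab = build_vocab([seq])
    print(seq)
    print(encode(seq, vocab))
```

In a full pipeline the integer ids would index an embedding matrix learned jointly with, or ahead of, the defect classifier. Keeping only node types is a simplification made here for brevity.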

Keywords: defect detection; deep learning; static analysis; semantic features; syntactic features

Classification number: TP391 [Automation and Computer Technology - Computer Application Technology]

 
