Authors: 徐泽鑫 (XU Ze-xin) [1,2,3]; 段立娟 (DUAN Li-juan) [1,2,3]; 王文健 (WANG Wen-jian) [1,2,3]; 恩擎 (EN Qing) [4]
Affiliations: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; [2] Beijing Key Laboratory of Trusted Computing, Beijing 100124, China; [3] National Engineering Laboratory for Critical Technologies of Information Security Classified Protection, Beijing 100124, China; [4] Artificial Intelligence and Machine Learning (AIML) Lab, School of Computer Science, Carleton University, Ottawa K1S 5B6, Canada
Source: Journal of Zhejiang University: Engineering Science (《浙江大学学报(工学版)》), 2022, No. 11, pp. 2260-2270 (11 pages)
Funding: National Natural Science Foundation of China (62176009, 62106065); Key Project of the Beijing Municipal Education Commission (KZ201910005008).
Abstract: To address the high false positive and false negative rates of existing code vulnerability detection methods, a detection method based on contextual feature fusion is proposed. The method decouples code features into code block local features and context global features. The code block local features capture the semantics of key tokens within a code block and their short-distance dependencies. The context global features are obtained by fusing the local features and capture the long-distance dependencies across the context of code lines. Learning local and global information jointly improves the model's feature learning ability, so the model identifies the programming patterns of code vulnerabilities more accurately. A code vulnerability comparison mapping module is added to widen the distance between positive and negative samples in the embedding space, which helps the model distinguish the two accurately. Experimental results on a real-world dataset that mixes the source code of 9 pieces of software show that precision improves by up to 29% and recall by up to 16%.
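Keywords: code vulnerability detection; code block local feature extraction; context global feature fusion; short-distance dependency; long-distance dependency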
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]