Vulnerability Similarity Algorithm Evaluation Based on NLP and Feature Fusion  Cited by: 3


Authors: JIA Fan[1], KANG Shuya, JIANG Weiqiang, WANG Guangtao

Affiliations: [1] School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China; [2] Information Security Center, China Mobile Group Co., Ltd., Beijing 100053, China

Source: Netinfo Security, 2023, Issue 1, pp. 18-27 (10 pages)

Funding: China Mobile Research Fund of the Ministry of Education [MCM20200106].

Abstract: Research on vulnerability similarity helps security researchers find solutions to new vulnerabilities from information about historical ones. Few studies have addressed vulnerability similarity so far, and model selection in the existing work lacks support from objective experimental data. This paper combines several word embedding techniques with deep learning autoencoders to compute semantic similarity from vulnerability description text. It also uses multi-dimensional feature data extracted from public databases such as NVD to compute similarity from the perspective of vulnerability features, and designs a dual-perspective vulnerability similarity measurement algorithm and evaluation scheme based on NLP and feature fusion. The experiments evaluate each model combination in terms of numerical distribution, similarity discrimination, and accuracy. The best model combination achieves an F1 score of up to 0.927 in vulnerability similarity determination.
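The dual-perspective idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two similarities are fused, so the weighted sum with weight `alpha`, the exact-match feature comparison, and all function and field names below are assumptions made for illustration. The description vectors are assumed to come from some word embedding plus autoencoder pipeline that is not shown here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def feature_similarity(feats_a, feats_b):
    """Fraction of shared feature keys whose values match exactly (assumed metric)."""
    shared = set(feats_a) & set(feats_b)
    if not shared:
        return 0.0
    return sum(feats_a[k] == feats_b[k] for k in shared) / len(shared)


def fused_similarity(vec_a, vec_b, feats_a, feats_b, alpha=0.5):
    """Weighted sum of semantic and feature similarity (assumed fusion rule)."""
    semantic = cosine_similarity(vec_a, vec_b)
    structural = feature_similarity(feats_a, feats_b)
    return alpha * semantic + (1 - alpha) * structural


def f1_score(y_true, y_pred):
    """F1 for binary labels, where 1 means 'similar pair'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical inputs: toy description vectors and toy feature records.
vec_a, vec_b = [1.0, 0.0], [1.0, 0.0]
feats_a = {"cwe": "CWE-79", "attack_vector": "NETWORK"}
feats_b = {"cwe": "CWE-79", "attack_vector": "LOCAL"}
score = fused_similarity(vec_a, vec_b, feats_a, feats_b, alpha=0.5)
```

A pair would then be judged similar when `score` exceeds a chosen threshold, and the resulting predictions could be compared against labeled pairs with `f1_score`.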

Keywords: natural language processing; deep learning; vulnerability similarity; word embedding

Classification code: TP309 [Automation and Computer Technology - Computer System Architecture]

 
