Research on Vulnerability Text Feature Classification Technology Based on BERT  (Cited by: 5)


Authors: Du Lin (杜林); Xu Chuanqi (许传淇)

Affiliations: [1] Tianjin Branch of National Computer Network Emergency Response Technical Team/Coordination Center of China, Tianjin 300100; [2] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044

Source: Journal of Information Security Research (《信息安全研究》), 2023, No. 7, pp. 687-692 (6 pages)

Abstract: With the development of informatization and the growth of network applications, many software and hardware products are affected by various types of cybersecurity vulnerabilities. Vulnerability analysis and management often require large volumes of vulnerability intelligence texts to be classified manually. To determine efficiently and accurately the category of the vulnerability that a vulnerability intelligence text describes, this paper proposes a cybersecurity vulnerability classification model based on BERT (bidirectional encoder representation from Transformers), a multi-layer bidirectional Transformer encoder. First, a vulnerability classification dataset is constructed, and a pre-trained model represents each vulnerability intelligence text as a feature vector. Next, the feature vectors are passed through a classifier to complete the classification. Finally, the test set is used to evaluate the classification performance. The experiment uses 48000 vulnerability intelligence texts containing vulnerability descriptions, which are classified with TextCNN, TextRNN, TextRNN_Att, fastText, and the proposed model. The experimental results show that the proposed model achieves the highest score on every classification evaluation metric on the test set, and that it can be applied effectively to cybersecurity vulnerability classification tasks to reduce manual workload.
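The last two steps of the abstract (feature vector to classifier, then evaluation on a test set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional feature vectors stand in for BERT output, the names `WEIGHTS`, `BIAS`, `classify`, and `evaluate` are hypothetical, and the choice of accuracy plus macro-averaged precision, recall, and F1 is an assumption, since the abstract does not name its evaluation metrics.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature, weights, bias):
    """Apply a linear layer and softmax to one feature vector.

    Returns the index of the most probable class and the probabilities.
    """
    logits = [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__), probs


def evaluate(y_true, y_pred):
    """Compute accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Hypothetical classifier parameters for three vulnerability categories.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
BIAS = [0.0, 0.0, 0.0]

# Toy stand-ins for BERT feature vectors of four test texts (assumption).
features = [[2.0, 0.1], [0.2, 1.5], [-1.0, -2.0], [1.0, 0.9]]
y_true = [0, 1, 2, 1]

y_pred = [classify(f, WEIGHTS, BIAS)[0] for f in features]
scores = evaluate(y_true, y_pred)
print(y_pred, scores)
```

In a real setting the feature vectors would come from a pre-trained BERT encoder and the classifier weights would be learned during fine-tuning; both are fixed by hand here only to keep the sketch self-contained.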

Keywords: natural language processing systems; cybersecurity; feature extraction; classifier; deep learning

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
