Multi-label Classification Method of Open Source Threat Intelligence Text Based on Bert-TextCNN  Cited by: 1


Author: Lu Jiali (Beijing Topsec Network Security Technology Co., Ltd., Beijing 100193)

Affiliation: [1] Beijing Topsec Network Security Technology Co., Ltd., Beijing 100193

Source: Journal of Information Security Research, 2024, No. 8, pp. 760-768 (9 pages)

Abstract: Open source threat intelligence is important for network security protection, but it is widely distributed, varied in form, and noisy. Organizing and analyzing the large volume of collected open source threat intelligence efficiently has therefore become an urgent problem. This paper explores a multi-label classification method based on the Bert-TextCNN model that takes the title, the body text, and regular-expression rules into account. The regular-expression rules are set according to the characteristics of the text published by intelligence sources, to compensate for the model's shortcomings. To reflect the threat topics covered by an open source threat intelligence text more fully, separate Bert-TextCNN multi-label classification models are built for the title and for the body, and the two sets of labels are merged and deduplicated to obtain the final threat categories of the text. Compared with a Bert-TextCNN multi-label classification model built on the body text alone, the proposed model performs better, with a marked improvement in recall, and can serve as a useful reference for classifying open source threat intelligence.
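The label-combination step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the regex patterns, the label names, the function names (`rule_labels`, `merge_labels`, `classify`), and the choice to take a plain union of rule labels and model labels are all assumptions, and the two Bert-TextCNN models are stood in for by arbitrary callables that return a list of labels.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical regex rules (pattern, label); the paper's actual rules are not given.
RULES = [
    (re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE), "vulnerability"),
    (re.compile(r"\bransomware\b", re.IGNORECASE), "ransomware"),
]


def rule_labels(text: str) -> List[str]:
    """Return the label of every rule whose pattern occurs in the text."""
    return [label for pattern, label in RULES if pattern.search(text)]


def merge_labels(*label_lists: Iterable[str]) -> List[str]:
    """Concatenate label lists and drop duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for labels in label_lists:
        for label in labels:
            if label not in seen:
                seen.add(label)
                merged.append(label)
    return merged


def classify(
    title: str,
    body: str,
    title_model: Callable[[str], List[str]],
    body_model: Callable[[str], List[str]],
) -> List[str]:
    """Combine rule labels with title-model and body-model labels (assumed union)."""
    return merge_labels(
        rule_labels(title + "\n" + body),
        title_model(title),
        body_model(body),
    )
```

With stub models in place of the trained classifiers, `classify("New ransomware campaign", "Exploits CVE-2024-12345", lambda t: ["ransomware"], lambda b: ["vulnerability", "phishing"])` yields one deduplicated label list covering the rules, the title, and the body.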

Keywords: open source threat intelligence; multi-label classification; text classification; Bert model; TextCNN model

Classification code: TP391 [Automation and Computer Technology - Computer Application Technology]

 
