Named entity recognition of network threat intelligence based on ALBERT

Cited by: 1


Authors: ZHOU Jing-xian (周景贤) [1]; WANG Zeng-qi (王曾琪)

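Affiliations: [1] Institute of Science and Technology Innovation, Civil Aviation University of China, Tianjin 300300, China; [2] College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China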

Source: Journal of Shaanxi University of Science & Technology (陕西科技大学学报), 2023, No. 1, pp. 187-195 (9 pages)

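Funding: National Natural Science Foundation of China (U1533104); Civil Aviation Safety Capacity Building Project of the Civil Aviation Administration of China (PESA2019074, PESA2021009); Fundamental Research Funds for the Central Universities, Civil Aviation University of China Special Project (3122018C036).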

Abstract: Entity recognition is a key step in cyber threat intelligence analysis. Traditional word embeddings cannot represent polysemy, which makes it hard to identify the key information carried by threat intelligence entities. At the same time, the volume of threat intelligence is growing exponentially, so the efficiency of recognition models urgently needs to improve. To address these problems, a named entity recognition model for network threat intelligence based on ALBERT is proposed. The model first uses ALBERT to extract dynamic feature word vectors from threat intelligence text. The feature word vectors are then fed into a bidirectional long short-term memory (BiLSTM) layer to obtain a label for each word in the sentence. Finally, a conditional random field (CRF) layer corrects these labels and outputs the label sequence with the maximum probability. Comparative experiments show that the proposed model reaches an F1 score of 92.21%, clearly better than the other models compared. At the same recognition accuracy, the proposed model also has lower time and resource costs, making it suitable for large-scale, efficient entity recognition in the cyber threat intelligence domain.
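The final step of the pipeline, where a CRF layer corrects per-word labels and outputs the most probable label sequence, can be illustrated with a small Viterbi decoder. This is a minimal sketch and not the authors' implementation: the BIO label set with a `MAL` (malware) entity type, the example tokens, and all emission and transition scores are hypothetical values chosen for illustration. In the paper's model the emission scores would come from the BiLSTM layer and the transition scores would be learned.

```python
from typing import Dict, List, Tuple

# Hypothetical label set and scores, chosen only for illustration.
LABELS = ["O", "B-MAL", "I-MAL"]
TOKENS = ["WannaCry", "ransomware", "spreads"]

# Per-token label scores, standing in for the BiLSTM layer's output.
EMISSIONS = [
    {"O": 1.0, "B-MAL": 0.9, "I-MAL": 0.1},
    {"O": 0.2, "B-MAL": 0.3, "I-MAL": 1.5},
    {"O": 2.0, "B-MAL": 0.1, "I-MAL": 0.2},
]

# Transition scores; pairs not listed default to 0.0.
# In BIO tagging an I- tag may not directly follow O, so it gets a large penalty.
TRANSITIONS = {("O", "I-MAL"): -1e4}


def greedy_decode(emissions: List[Dict[str, float]], labels: List[str]) -> List[str]:
    """Pick the best label at each position independently (no CRF)."""
    return [max(labels, key=lambda y: e[y]) for e in emissions]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the label sequence with the highest total score."""
    # best[y] = score of the best path that ends in label y at the current position
    best = {y: emissions[0][y] for y in labels}
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointer = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointer[cur] = prev
        best = new_best
        backpointers.append(pointer)
    # Trace back from the best final label.
    path = [max(labels, key=lambda y: best[y])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


print(greedy_decode(EMISSIONS, LABELS))                 # ['O', 'I-MAL', 'O']
print(viterbi_decode(EMISSIONS, TRANSITIONS, LABELS))   # ['B-MAL', 'I-MAL', 'O']
```

With these scores, independent per-word labelling yields the invalid sequence `O, I-MAL, O`, while decoding over the whole sequence with the transition penalty yields `B-MAL, I-MAL, O`. That is the kind of correction the abstract attributes to the CRF layer.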

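Keywords: cyber threat intelligence; named entity recognition; BERT; ALBERT; bidirectional long short-term memory network; conditional random field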

Classification No.: TP393 [Automation and Computer Technology: Computer Application Technology]

 
