Authors: 刘卓娴 (LIU Zhuoxian), 王靖亚 (WANG Jingya) [1], 石拓 (SHI Tuo) [2] (Information and Network Security College, People's Public Security University of China, Beijing 100038, China; Department of Public Security Management, Beijing Police College, Beijing 102202, China)
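Affiliations: [1] Information and Network Security College, People's Public Security University of China, Beijing 100038; [2] Department of Public Security Management, Beijing Police College, Beijing 102202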
Source: Netinfo Security (《信息网络安全》), 2024, No. 12, pp. 1922-1932 (11 pages)
Funding: Beijing Natural Science Foundation [9244025]; National Social Science Fund of China, Key Program [20AZD114].
Abstract: A malicious URL is an identifier for locating network resources that is exploited to carry out malicious activities such as fraud, extortion, and data theft. Malicious URLs have become an important medium for many kinds of cyberattacks in recent years and have caused heavy losses to victims. Malicious URL attacks are increasingly prevalent, and the URLs themselves have complex, easily confused, and highly deceptive characteristics; existing research also extracts features insufficiently and pays too little attention to model robustness and generalization. To address these problems, this paper proposes a malicious URL detection model that combines adversarial training with a BERT-CNN-BiLSTM multichannel neural network. The model treats a URL as a text sequence and uses the BERT model for preprocessing. A CNN layer then extracts local semantic features and a BiLSTM layer captures contextual sequential features. Adversarial training with the Fast Gradient Method (FGM) applies perturbations to the embedding layer, which improves the model's accuracy and robustness. Experimental results on public datasets show that the model reaches a classification accuracy of 97.2% on the binary URL classification task. Ablation and comparative experiments further confirm the model's significant advantages across multiple evaluation metrics. The model also performs well on finer-grained classification of malicious URLs, reaching a classification accuracy of 98.25% on a five-class URL classification task.
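The FGM step named in the abstract can be illustrated with a small sketch. FGM adds a perturbation r = epsilon * g / ||g||_2 to the embedding, where g is the gradient of the loss with respect to that embedding. Everything below is an assumption made for illustration and is not the authors' implementation: the toy linear classifier stands in for the BERT-CNN-BiLSTM network, and the names `fgm_perturbation`, `loss_and_grad`, the vectors, and the value of `epsilon` are hypothetical.

```python
import math

def fgm_perturbation(grad, epsilon=1.0):
    """Return r = epsilon * g / ||g||_2 (a zero vector if the gradient is zero)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(embedding, weights, label):
    """Binary cross-entropy of a toy linear classifier (a stand-in for the
    real network) and its gradient with respect to the embedding."""
    p = sigmoid(sum(w * e for w, e in zip(weights, embedding)))
    loss = -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    grad = [(p - label) * w for w in weights]
    return loss, grad

# Hypothetical embedding, classifier weights, and label (1 = malicious).
embedding = [0.5, -1.0, 0.25]
weights = [1.0, 2.0, -0.5]
label = 1

clean_loss, grad = loss_and_grad(embedding, weights, label)
r = fgm_perturbation(grad, epsilon=0.5)               # perturbation with L2 norm 0.5
adv_embedding = [e + d for e, d in zip(embedding, r)]  # perturbed embedding
adv_loss, _ = loss_and_grad(adv_embedding, weights, label)

print(f"clean loss: {clean_loss:.4f}, adversarial loss: {adv_loss:.4f}")
```

In a typical FGM training loop, the gradients from the clean and the perturbed forward passes are accumulated before the optimizer step, and the embedding weights are restored afterwards; the sketch shows only how the perturbation is built and that it increases the loss.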
Keywords: adversarial training; BERT; multichannel neural network; malicious URL detection
Classification code: TP309 [Automation and Computer Technology: Computer System Architecture]