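Authors: Ma Donglin[1], Chen Weijie, Zhao Hong[1], Song Jiajia (School of Computer Science and Communication, Lanzhou University of Technology, Lanzhou 730050, China)
Affiliation: [1] School of Computer Science and Communication, Lanzhou University of Technology, Lanzhou 730050, China
Source: Electronic Measurement Technology, 2024, No. 20, pp. 15-23 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (62166025).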
Abstract: Current malicious URL detection models rely on a single type of feature extraction and achieve limited detection accuracy when handling URLs with complex structures and diverse character combinations. To address this, the paper proposes a malicious URL detection model based on multi-scale attention feature fusion. First, Character Embeddings and DistilBERT encode characters and words respectively, capturing both character-level and word-level feature representations of the URL string. Next, an improved convolutional neural network (CNN) extracts character structural features and word-level semantic features at different scales, and a bidirectional long short-term memory (BiLSTM) network further extracts deep sequence features. In addition, an attention feature fusion (AFF) module is introduced to dynamically fuse the character-level and word-level multi-scale features, reducing information redundancy and improving the extraction of long-range sequence features. Experimental results show that, compared with the baseline models, the proposed model improves accuracy by 0.32% to 4.7% and F1 score by 0.46% to 5.5%, and that it also achieves good detection performance on datasets such as ISCX-URL2016.
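Keywords: malicious URL detection; multi-scale features; convolutional neural network; bidirectional long short-term memory network; attention feature fusion
Classification: TN391 [Electronics and Telecommunications: Physical Electronics]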
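Note: The abstract describes the attention feature fusion (AFF) step only in prose. The sketch below illustrates the general idea of gated fusion, in which a weight computed from the sum of two feature vectors decides, element by element, how much of each input to keep. It is not the authors' implementation: the function name `attention_fuse` and the `scale` and `bias` parameters are hypothetical, and the fixed affine gate stands in for the learned attention sub-network that a real model would use.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def attention_fuse(char_feat, word_feat, scale=1.0, bias=0.0):
    """Fuse two equal-length feature vectors with a per-element gate.

    gate  = sigmoid(scale * (x + y) + bias)
    fused = gate * x + (1 - gate) * y

    `scale` and `bias` are hypothetical placeholders for learned
    attention parameters; they are assumptions, not values from the paper.
    """
    if len(char_feat) != len(word_feat):
        raise ValueError("feature vectors must have the same length")
    fused = []
    for x, y in zip(char_feat, word_feat):
        gate = sigmoid(scale * (x + y) + bias)  # weight given to the character-level feature
        fused.append(gate * x + (1.0 - gate) * y)  # convex combination of the two inputs
    return fused


if __name__ == "__main__":
    # Toy character-level and word-level features for one URL (made-up numbers).
    print(attention_fuse([0.2, 1.5, -0.7], [0.9, -0.3, 0.4]))
```

Because the gate lies strictly between 0 and 1, each fused value is a convex combination of the two inputs, so it always falls between the character-level and word-level values for that position.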