Authors: Shi Zhibin (师智斌) [1], Sun Wenqi (孙文琦), Dou Jianmin (窦建民), Yu Mengyang (于孟洋)
Affiliations: [1] School of Computer Science and Technology, North University of China, Taiyuan 030051; [2] Third Research Institute of Ministry of Public Security, Shanghai 200031; [3] North Navigation Control Technology Co., Ltd., Beijing 100176
Source: Journal of Information Security Research (《信息安全研究》), 2025, No. 5, pp. 412-419 (8 pages)
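Funding: Open Project of the Key Laboratory of Information Network Security, Ministry of Public Security (Third Research Institute of Ministry of Public Security) (C23600-06).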
Abstract: Existing traditional methods are limited in feature extraction and representation, cannot capture the spatial-semantic and temporal features of API sequences at the same time, and fail to capture the key features that determine the target task. To address these problems, a malware detection method based on word embedding and feature fusion is proposed, drawing on word embedding from natural language processing together with multi-model feature extraction and feature fusion. First, word embedding is used to encode the API sequences and obtain their semantic feature representations. Then, multiple convolutional networks and a Bi-LSTM network extract the n-gram local spatial features and the temporal features of the API sequences, respectively. Finally, a self-attention mechanism deeply fuses the captured features according to key position information, and classification is performed by characterizing deep malicious behavior features. Experimental results show that in the binary classification task the method reaches an accuracy of 94.79%, an average improvement of 12.37% over traditional machine learning methods and of 5.78% over deep learning methods. In the multi-class classification task the accuracy reaches 91.95%, showing that the method effectively improves malware detection accuracy.
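The first two stages described in the abstract (encoding API sequences for an embedding layer, then looking at n-gram windows) can be illustrated with a minimal standard-library sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the padding and unknown tokens, the toy Windows API traces, and the fixed length are all hypothetical, and `self_attention` is plain scaled dot-product attention with identity projections rather than the learned fusion layer the paper trains.

```python
from collections import Counter
import math

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens

def build_vocab(sequences, min_count=1):
    """Map each API name to an integer ID; 0 is padding, 1 is unknown."""
    counts = Counter(api for seq in sequences for api in seq)
    vocab = {PAD: 0, UNK: 1}
    for api, count in sorted(counts.items()):
        if count >= min_count:
            vocab[api] = len(vocab)
    return vocab

def encode(seq, vocab, max_len):
    """Turn an API sequence into a fixed-length ID list (truncate, then pad)."""
    ids = [vocab.get(api, vocab[UNK]) for api in seq[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))

def ngram_windows(ids, n):
    """Sliding windows of width n: the spans a width-n 1D convolution covers."""
    return [tuple(ids[i:i + n]) for i in range(len(ids) - n + 1)]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(vectors[0])
    fused = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        fused.append([sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)])
    return fused

# Toy API call traces (illustrative only, not taken from the paper's dataset).
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegSetValueExW", "CreateFileW", "WriteFile"],
]
vocab = build_vocab(traces)
encoded = [encode(t, vocab, max_len=5) for t in traces]
trigrams = ngram_windows(encoded[0], 3)
fused = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

In a full model the integer IDs would index a trainable embedding table, the convolution branches would use several window widths in parallel, and a Bi-LSTM would read the same embedded sequence; the attention step would then weight the combined features before the classifier.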
Keywords: malware detection; software call sequence; multiple convolutional networks; long short-term memory network; feature fusion
Classification code: TP309 [Automation and Computer Technology: Computer System Architecture]