Malware Detection Based on Variational Autoencoder and API Behavior Feature Extraction


Authors: YU Meng-yang, SHI Zhi-bin[1], HAO Wei-ze, ZHANG Shu-juan[1] (School of Computer Science and Technology, North University of China, Taiyuan 030051, China)

Affiliation: [1] School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051, China

Source: Computer Engineering and Design, 2025, No. 2, pp. 464-471 (8 pages)

Funding: Fundamental Research Program of Shanxi Province (20210302123018).

Abstract: Existing detection methods lack the ability to model data continuity and integrity, struggle to extract global features of API call sequences, and extract only a single kind of API behavioral semantic representation. To address these problems, a malware detection method based on a variational autoencoder and API behavior feature extraction is proposed. API calls are represented as dense semantic vectors through word embedding. A variational autoencoder architecture learns latent state representations of the data, extracting the global features and patterns of malware. A multi-layer convolutional neural network extracts behavioral semantic features from call subsequences of different granularities, while call frequencies are counted to obtain API usage weight information. These features are then combined for malware detection. Experimental results show that the method achieves 97.81% accuracy in benign/malicious detection and 93.74% accuracy in multi-class classification on the Alibaba Cloud dataset, validating its effectiveness.
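The two simplest ingredients named in the abstract, call-frequency weights and call subsequences of different granularities, can be illustrated with a small sketch. This is not the authors' implementation: the function names `api_frequency_weights` and `subsequences`, the relative-frequency weighting, and the example trace are all assumptions made for illustration. The sliding windows only show which subsequences a one-dimensional convolution kernel of a given width would cover; the embedding, variational autoencoder, and convolutional layers themselves are not modeled here.

```python
from collections import Counter


def api_frequency_weights(calls):
    """Relative frequency of each API name in one call sequence.

    Assumption: "usage weight" is taken to be plain relative frequency;
    the paper may define it differently.
    """
    total = len(calls)
    if total == 0:
        return {}
    counts = Counter(calls)
    return {api: n / total for api, n in counts.items()}


def subsequences(calls, width):
    """All contiguous windows of `width` calls.

    These are the spans a 1-D convolution kernel of that width would slide
    over; varying `width` gives subsequences of different granularity.
    """
    return [tuple(calls[i:i + width]) for i in range(len(calls) - width + 1)]


# Hypothetical API call trace for a single sample (illustrative only).
trace = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]

weights = api_frequency_weights(trace)
windows = {width: subsequences(trace, width) for width in (2, 3)}
```

In a full pipeline along the lines the abstract describes, the frequency weights would be combined with the learned latent and convolutional features before classification; how that fusion is done is not specified in the abstract.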

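Keywords: malware detection; variational autoencoder; multi-layer convolutional neural network; sequence information; behavioral semantics; frequency information; feature fusion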

Classification code: TP309.5 [Automation and Computer Technology: Computer System Architecture]

 
