Realization of malicious code family classification based on semi-supervised generative adversarial network  (Cited by: 3)


Authors: WANG Dong (王栋), YANG Ke (杨珂), XUAN Jia-xing (玄佳兴), HAN Yu-tong (韩雨桐), ZHAO Li-hua (赵丽花), WANG Xu-ren (王旭仁) [4]

Affiliations: [1] State Grid Electronic Commerce Co., Ltd. (State Grid Xiong'an Financial Technology Group Co., Ltd.), Beijing 100053; [2] Blockchain Technology Laboratory, State Grid Corporation of China, Beijing 100053; [3] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093; [4] College of Information Engineering, Capital Normal University, Beijing 100048, China

Source: Computer Engineering & Science (《计算机工程与科学》), 2022, No. 5, pp. 826-833 (8 pages)

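Funding: National Natural Science Foundation of China (61872252); National Key R&D Program of China (2018YFB0805005); State Grid E-Commerce Co. Science and Technology Project (2500/2020-72001B).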

Abstract: With the development of the Internet, malicious code has become massive in volume and increasingly polymorphic, and classifying malicious code into families is one of the challenges facing cyberspace security. This paper combines a semi-supervised generative adversarial network with a deep convolutional network to build a semi-supervised deep convolutional generative adversarial network, and proposes a malicious code family classification model. Based on an analysis of family characteristics, features are extracted from the malicious code and converted into one-dimensional grayscale images; a semi-supervised generative adversarial network (SGAN) is then built on a one-dimensional convolutional neural network (1D-CNN), yielding the malicious code family classification model SGAN-CNN. Classification ability is improved through optimized feature extraction and an optimized semi-supervised generative adversarial training algorithm. To verify the classification performance of SGAN-CNN, experiments are carried out on the Microsoft Malware Classification Challenge dataset. 5-fold cross-validation shows that the proposed model reaches an average classification accuracy of 98.81% when 80% of the samples are labeled, and 98.01% when only 20% of the samples are labeled. The model also achieves good classification accuracy with a small number of samples.

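Keywords: deep learning; one-dimensional convolutional neural network; semi-supervised learning; generative adversarial network; malicious code classification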

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]
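
Illustrative sketch: the abstract describes converting malicious code into one-dimensional grayscale images and training with only part of the samples labeled. The Python sketch below shows one plausible way to prepare such inputs; it is not the authors' implementation. The hex-dump layout (an address column followed by byte tokens, with `??` marking unreadable bytes), the function names `bytes_text_to_gray_vector` and `mask_labels`, the default vector length of 1024, and the choice to map `??` to 0 are all assumptions made for illustration.

```python
import random


def bytes_text_to_gray_vector(text, length=1024):
    """Turn a hex-dump string into a fixed-length 1D grayscale vector.

    Assumed input layout (hypothetical, one line per row):
        <address> <hex byte> <hex byte> ...
    Each byte becomes one "pixel" scaled to the range [0, 1].
    """
    values = []
    for line in text.splitlines():
        tokens = line.split()
        for tok in tokens[1:]:        # skip the leading address column
            if tok == "??":           # unreadable byte; assumption: map to 0
                values.append(0)
            else:
                values.append(int(tok, 16))
    values = values[:length]                    # truncate long samples
    values += [0] * (length - len(values))      # zero-pad short samples
    return [v / 255.0 for v in values]


def mask_labels(labels, labeled_fraction, seed=0):
    """Keep labels for a random fraction of samples; set the rest to None.

    This simulates a semi-supervised setting such as the 20% and 80%
    labeled-sample rates mentioned in the abstract.
    """
    rng = random.Random(seed)
    n_keep = round(len(labels) * labeled_fraction)
    keep = set(rng.sample(range(len(labels)), n_keep))
    return [lab if i in keep else None for i, lab in enumerate(labels)]


if __name__ == "__main__":
    sample = "00401000 56 8D ?? FF\n00401010 00 10"
    vector = bytes_text_to_gray_vector(sample, length=8)
    print(len(vector))
    print(mask_labels([1, 2, 3, 4, 5, 6, 7, 8, 9, 1], 0.2))
```

A vector produced this way could be fed to a 1D convolutional classifier, with samples whose label is `None` contributing only to the unsupervised (real versus generated) part of an SGAN-style objective. The network architecture and training algorithm are not specified in this record and are therefore not sketched here.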

 
