DGA Domain Name Detection Method Based on Similarity


Authors: SUN Haidong (孙海栋); LIU Wanping (刘万平) [1]; HUANG Dong (黄东)

Affiliations: [1] College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China; [2] Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang 550025, China

Source: Computer Science (《计算机科学》), 2023, Issue S01, pp. 740-745 (6 pages)

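Funding: Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0594); Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201901101).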

Abstract: Botnets pose a serious threat to the Internet. Malicious activities that rely on botnets, such as distributed denial-of-service (DDoS) attacks and spam, can cause heavy losses to their targets. Because botnet communication relies mainly on domain names produced by domain generation algorithms (DGAs), these domain names need to be detected. Existing detection methods mainly extract domain name features from character encodings and then classify them with neural networks. Since they consider only character features, their accuracy in detecting DGA domain names is often low. To detect DGA domain names accurately, this paper proposes methods for calculating domain name character similarity and domain name node similarity, and detects DGA domain names according to these similarities. First, a model with a bidirectional gated recurrent unit (BiGRU) neural network as its base learner is constructed to screen out DGA domain names with obvious features from the dataset. Then, a recurrent neural network is used to cluster the screened DGA domain names. Finally, the similarity between each domain name to be detected in the dataset and the DGA domain names is calculated, and domain names whose similarity exceeds a threshold are classified as DGA domain names. Experimental results show that the method reaches an accuracy of 99.03% when detecting datasets containing multiple classes of DGA domain names.

Keywords: DGA domain name; botnet; domain name detection; similarity calculation; gated recurrent unit

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]
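
The last step described in the abstract, classifying a domain name as DGA when its similarity to known DGA domain names exceeds a threshold, can be illustrated with a minimal Python sketch. The abstract does not define the paper's character similarity or node similarity, so this sketch substitutes Jaccard similarity over character bigrams as a stand-in. The function names, the naive label extraction, and the threshold of 0.5 are all assumptions made for illustration; the BiGRU screening and the clustering step are not shown.

```python
from __future__ import annotations


def char_ngrams(label: str, n: int = 2) -> set[str]:
    """Return the set of character n-grams of a domain label."""
    label = label.lower()
    if len(label) < n:
        return {label} if label else set()
    return {label[i:i + n] for i in range(len(label) - n + 1)}


def jaccard_similarity(a: str, b: str, n: int = 2) -> float:
    """Stand-in character similarity: Jaccard index of n-gram sets.

    This is NOT the similarity measure from the paper, which the
    abstract does not specify.
    """
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


def second_level_label(domain: str) -> str:
    """Hypothetical helper: naively take the label left of the last dot.

    Does not handle multi-part public suffixes such as 'co.uk'.
    """
    parts = domain.lower().strip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def is_dga(domain: str, known_dga: list[str], threshold: float = 0.5) -> bool:
    """Flag a domain when its best similarity to any known DGA domain
    exceeds the (assumed) threshold."""
    label = second_level_label(domain)
    best = max(
        (jaccard_similarity(label, second_level_label(d)) for d in known_dga),
        default=0.0,
    )
    return best > threshold
```

In this sketch, `is_dga("xkqzvbnx.com", ["xkqzvbnw.com"])` compares the labels `xkqzvbnx` and `xkqzvbnw`, which share six of eight distinct bigrams, so the domain is flagged at the assumed threshold. A real system would tune the threshold on labeled data.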

 
