Authors: 赵宏 (ZHAO Hong) [1]; 常兆斌 (CHANG Zhao-bin); 王伟杰 (WANG Wei-jie)
Affiliation: [1] School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China
Source: Microelectronics & Computer (《微电子学与计算机》), 2020, No. 5, pp. 13-17 (5 pages)
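Funding: National Natural Science Foundation of China (51668043, 61262016); CERNET Next Generation Internet Technology Innovation Project (NGII20160311, NGII20160112); Lanzhou University of Technology Student Science and Technology Innovation Fund (KC2019ZR016).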
Abstract: To address the complex feature extraction and limited detection accuracy of existing malicious domain name detection methods, this paper proposes a detection algorithm based on a deep auto-encoder and a decision tree (Deep Auto Encoder and Decision Tree, DAE-DT). The algorithm first maps each domain name into a feature space according to attributes such as its lexical composition and structure, and normalizes the result. It then constructs a deep auto-encoder network that takes the normalized, unlabeled domain name data with values randomly set to 0 as input and produces the statistical character features of the domain name as output. The parameters and weights of each layer are optimized using the reconstruction error between the model output and the unprocessed data, which makes the model more robust. Finally, a decision tree for identifying malicious domain names is built from the extracted statistical character features. In tests on the Alexa and Malware Domain List datasets, the model achieves an accuracy of 95.21%, a precision of 94.17%, a false negative rate of 2.41%, and a false positive rate of 3.63%.
Keywords: malicious domain name detection; deep auto-encoder; decision tree; domain name statistical features; reconstruction error
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
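Illustrative sketch: the preprocessing steps named in the abstract (lexical feature mapping, normalization, randomly setting inputs to 0, and a reconstruction error) might look roughly like the following. This is not the authors' implementation. The five features, min-max scaling, the `drop_prob` parameter, and mean squared error as the error measure are all assumptions chosen for illustration; the abstract does not specify any of them, and the auto-encoder and decision tree themselves are not shown.

```python
import math
import random
from collections import Counter


def lexical_features(domain):
    """Map a domain name to a small vector of lexical features.

    The feature set is hypothetical: character count, label count,
    digit ratio, vowel ratio, and character entropy.
    """
    name = domain.lower().rstrip(".")
    labels = name.split(".")
    chars = name.replace(".", "")
    n = len(chars)
    if n == 0:
        return [0.0, float(len(labels)), 0.0, 0.0, 0.0]
    counts = Counter(chars)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    digits = sum(ch.isdigit() for ch in chars)
    vowels = sum(ch in "aeiou" for ch in chars)
    return [float(n), float(len(labels)), digits / n, vowels / n, entropy]


def min_max_normalize(rows):
    """Scale every feature column to [0, 1] (assumed normalization scheme)."""
    cols = list(zip(*rows))
    lows = [min(col) for col in cols]
    highs = [max(col) for col in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def mask_randomly(rows, drop_prob, rng):
    """Set each value to 0 with probability drop_prob (the corrupted input)."""
    return [[0.0 if rng.random() < drop_prob else v for v in row] for row in rows]


def reconstruction_error(original, reconstructed):
    """Mean squared error between uncorrupted data and a model's output."""
    total = sum(
        (a - b) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for a, b in zip(row_o, row_r)
    )
    count = sum(len(row) for row in original)
    return total / count


if __name__ == "__main__":
    # Made-up example domains, not taken from Alexa or Malware Domain List.
    domains = ["google.com", "xk2j9q7zt4.info", "mail.example.org"]
    clean = min_max_normalize([lexical_features(d) for d in domains])
    noisy = mask_randomly(clean, drop_prob=0.3, rng=random.Random(0))
    # With no trained model, the corrupted input stands in for the model output.
    print(reconstruction_error(clean, noisy))
```

In a full pipeline, `noisy` would be fed to the auto-encoder and `clean` would serve as the reconstruction target, so that the network learns features that tolerate missing inputs before a decision tree is trained on them.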