检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:霍跃华[1,2] 吴文昊[1] 赵法起 王强 HUO Yuehua;WU Wenhao;ZHAO Faqi;WANG Qiang(School of Mechanical Electronic&Information Engineering,China University of Mining and Technology-Beijing,Beijing 100083,China;School of Network and Information Center,China University of Mining and Technology-Beijing,Beijing 100083,China;Institute of Information Engineering,Chinese Academy of Science,Beijing 100084,China;School of Cyber Security,University of Chinese Academy of Sciences,Beijing 100049,China)
机构地区:[1]中国矿业大学(北京)机电与信息工程学院,北京100083 [2]中国矿业大学(北京)网络与信息中心,北京100083 [3]中国科学院信息工程研究所,北京100084 [4]中国科学院大学网络空间安全学院,北京100049
出 处:《西安电子科技大学学报》2023年第4期139-147,共9页Journal of Xidian University
基 金:信息系统安全技术重点实验室基金资助项目(CNKLSTISS-6142111190501)。
摘 要:针对基于机器学习的传输层安全协议加密恶意流量检测方法对标注样本依赖度高的问题,提出了一种基于半监督学习的传输层安全协议加密恶意流量检测方法。在少量标注样本的情况下,利用协同训练策略协同加密流量的两个视图,通过引入无标注样本进行训练,扩大样本集,进而减少对标注样本的依赖。首先,提取加密流量特征中独立性强的流元数据特征和证书特征,并分别构建协同训练的两个视图。其次,针对两个视图分别构建XGBoost分类器和随机森林分类器。最后,通过协同训练策略协同两个分类器构成多视图协同训练分类器检测模型,利用小规模标注样本和大量无标注样本进行模型训练。在公开数据集上,模型准确率达到了99.17%,召回率达到了98.54%,误报率低于0.18%。实验结果表明,在小规模标注样本的条件下,能够有效降低对标注样本依赖度。Aiming at the problem of high dependence on labeled samples in machine learning-based malicious traffic detection methods for transport layer security protocol encryption,a semi-supervised learning-based malicious traffic detection method for transport layer security protocol encryption is proposed.With only a small number of labeled samples,the co-training strategy is utilized for the first time to joint two views of the encrypted traffic,and the training is performed by introducing unlabeled samples to expand the sample set and thereby to reduce the dependence on labeled samples.First,the flow metadata features with strong independence and certificate features in encrypted traffic features are extracted to construct each view for collaborative training,respectively.Second,the XGBoost classifier and random forest classifier are constructed for each view respectively.Finally,the two classifiers are collaboratively trained to form a multi-view co-training classifier detection model through the co-training strategy,with the model trained using a small number of labeled samples and a large number of unlabeled samples.The model achieves an accuracy rate of 99.17%,a recall rate of 98.54%,and a false positive rate of less than 0.18%on the public dataset.Experimental results show that the proposed method can effectively reduce the dependence on labeled samples under the condition of a small number of labeled samples.
关 键 词:协同训练 传输层安全协议 多视图 特征选择 半监督学习
分 类 号:TP393[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.222