Authors: WANG Zhen (王震), LI Saifei (李赛飞)[1], ZHANG Lijie (张丽杰)
Affiliations: [1] School of Information Science & Technology, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; [2] Information Technology Center, Norla Institute of Technical Physics, Chengdu, Sichuan 610041, China
Source: Information Security and Communications Privacy (《信息安全与通信保密》), 2022, No. 8, pp. 71-82 (12 pages)
Funding: Sichuan Science and Technology Program (No. 2021YJ0372); Sichuan Major Science and Technology Special Project (No. 2019ZDZX0007, No. 2021YFQ0056).
Abstract: Automated red team testing is an active research topic that aims to make cybersecurity assessment more efficient, lower in cost, and repeatable. Automated attack plan generation is a key part of automated red team testing; its purpose is to replace the attack planning otherwise carried out by security experts. This paper combines reinforcement learning with red team testing: the testing process is modeled as a Markov decision process, and an agent is trained in a simulated environment to construct attack plans using a policy-based algorithm (Policy Gradient, PG) and value-based algorithms (Q-Learning, SARSA, and Deep Q Network, DQN). The feasibility and adaptability of the resulting attack plans are then verified in an experimental environment. The simulation and experimental results indicate that PG learns only a non-optimal attack plan and converges slowly, whereas Q-Learning, SARSA, and DQN all learn the optimal attack plan; Q-Learning converges fastest, followed by SARSA, with DQN the slowest. The attack plans constructed by the reinforcement learning algorithms show good feasibility and adaptability.
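The difference between the two tabular value-based methods named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the four-state toy MDP, the abstract actions 0 and 1, the rewards, and the hyperparameters are all hypothetical. The only point shown is the standard update rule: Q-Learning bootstraps from the best next action, while SARSA bootstraps from the action the agent actually takes next.

```python
import random

# Hypothetical toy MDP (NOT the paper's environment): states 0..2 are abstract
# stages of a simulated assessment and state 3 is the goal.
# TRANSITIONS[state][action] -> (next_state, reward)
TRANSITIONS = {
    0: {0: (1, -1.0), 1: (1, -5.0)},   # both reach stage 1, action 1 costs more
    1: {0: (0, -1.0), 1: (2, -1.0)},   # action 0 falls back to the start
    2: {0: (3, 10.0), 1: (0, -1.0)},   # action 0 reaches the goal
}
GOAL = 3


def train(algorithm="q_learning", episodes=500, alpha=0.5,
          gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning or SARSA with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {state: [0.0, 0.0] for state in (0, 1, 2, GOAL)}

    def choose(state):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            return rng.randrange(2)
        return max((0, 1), key=lambda a: q[state][a])

    for _ in range(episodes):
        state = 0
        action = choose(state)
        for _ in range(50):  # step cap so an episode always ends
            next_state, reward = TRANSITIONS[state][action]
            next_action = choose(next_state) if next_state != GOAL else 0
            if algorithm == "q_learning":
                # Off-policy target: value of the best next action.
                target = reward + gamma * max(q[next_state])
            else:
                # SARSA, on-policy target: value of the action taken next.
                target = reward + gamma * q[next_state][next_action]
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
            if state == GOAL:
                break
    return q


def greedy_plan(q):
    """Read a plan off the learned table by following greedy actions."""
    plan, state = [], 0
    while state != GOAL and len(plan) < 10:
        action = max((0, 1), key=lambda a: q[state][a])
        plan.append(action)
        state = TRANSITIONS[state][action][0]
    return plan
```

In this toy MDP the cheapest route to the goal is the action sequence [0, 1, 0], and calling `greedy_plan(train("q_learning"))` reads the learned plan off the table. The sketch makes no claim about the convergence ordering reported in the abstract, which depends on the paper's own environment and settings.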
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]