Simulation of Network Confusion Attack Defense Based on Reinforcement Learning Algorithm (基于强化学习算法的网络混淆攻击防御仿真)

Authors: ZHANG Guo-zhang (张国章) [1]; TAN Qiang-qiang (谭强强) [2]

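Affiliations: [1] Guangdong Construction Engineering Group Holdings Co., Ltd., Guangzhou, Guangdong 510110, China; [2] Guangdong University of Technology, Guangzhou, Guangdong 510006, China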

Source: Computer Simulation (《计算机仿真》), 2024, No. 12, pp. 462-466 (5 pages)

Abstract: Improving the stability of data transmission is a current direction of network development, and raising the success rate of defense against network confusion attacks is important for the security of information transmission and user privacy. To improve the response timeliness of the defense model, this paper combines the MDP decision algorithm with an RL (reinforcement learning) network and establishes an improved reinforcement learning defense model against network confusion attacks. First, the discount factor is strengthened, and the traditional MDP decision algorithm is optimized through weighted calculation of reward data while the set of agent actions is enlarged. Second, the immediate reward of action management is raised and, with the lasting reward as the objective, MDP is used to decide the optimal action, while a loss model is used to solve the problem of equipment growth. Third, the MCM algorithm is used to solve the network: through system sampling and backward derivation of the gradient function, the estimated expected return is driven toward the state value, which improves the response timeliness of the improved RL defense model. Finally, an SSID network confusion attack model is set up, and the success rate of the defense algorithm is verified with a game attack model. The network data transmission simulation shows that, even as the capacity and quantity of concurrently transmitted data increase, the improved defense model keeps a high response timeliness, and its overhead time increment is the smallest among the models compared, which include three traditional defense models. The confusion attack defense simulation shows that, compared with the unprotected network model, the defense success rate of the improved RL network defense model increases by 41.10% on average, and compared with the MAC, DAC and MD5 network defense models, the overall success rate of the RL defense network model increases by 12.95%. In short, the improved reinforcement learning network defense algorithm proposed in this paper responds quickly and achieves a high defense success rate, and it has considerable value for simulation analysis in network information security research.
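The abstract says that sampling and a gradient-based update drive the estimated expected return toward the state value. As a rough illustration of that idea only, the Python sketch below estimates state values from sampled discounted returns. It rests on several assumptions that are not in the source: "MCM" is read here as a Monte Carlo method, the four-state chain, the discount factor 0.9, the step size 0.01, and all function names are hypothetical, and none of this reproduces the authors' model, reward weighting, or attack scenario.

```python
import random

GAMMA = 0.9    # discount factor (assumed value, not taken from the paper)
ALPHA = 0.01   # step size of the gradient update (assumed value)
TERMINAL = 3   # hypothetical toy chain: states 0..3, state 3 is terminal


def step(state, rng):
    """Move right with probability 0.7, otherwise stay; reward 1 on reaching the end."""
    nxt = state + 1 if rng.random() < 0.7 else state
    reward = 1.0 if nxt == TERMINAL else 0.0
    return nxt, reward


def sample_episode(start, rng):
    """Roll out one episode and record (state, reward) for every step taken."""
    trajectory = []
    state = start
    while state != TERMINAL:
        nxt, reward = step(state, rng)
        trajectory.append((state, reward))
        state = nxt
    return trajectory


def discounted_returns(trajectory, gamma=GAMMA):
    """Compute G_t = r_t + gamma * G_{t+1} by sweeping the episode backwards."""
    g = 0.0
    out = []
    for state, reward in reversed(trajectory):
        g = reward + gamma * g
        out.append((state, g))
    out.reverse()
    return out


def estimate_values(episodes=5000, seed=0):
    """Every-visit Monte Carlo estimate of the state values of the toy chain."""
    rng = random.Random(seed)
    values = [0.0] * (TERMINAL + 1)
    for _ in range(episodes):
        for state, g in discounted_returns(sample_episode(0, rng)):
            # Gradient step on the loss 0.5 * (g - V(s))**2 with respect to V(s):
            # the estimate moves a small distance toward the sampled return.
            values[state] += ALPHA * (g - values[state])
    return values


if __name__ == "__main__":
    print(estimate_values())
```

For this toy chain the exact value of state 2 is 0.7 / (1 - 0.3 * 0.9), about 0.959, so the printed estimate for that state should land close to it. A constant step size is used for simplicity; it tracks the sample mean with some residual noise rather than converging exactly.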

Keywords: reinforcement algorithm; network security; confusion attack

Classification number: TP391.9 [Automation and Computer Technology - Computer Application Technology]
