Multi-Stage Game-based Topology Deception Method Using Deep Reinforcement Learning

Authors: HE Weizhen, TAN Jinglei, ZHANG Shuai [1]; CHENG Guozhen [1,2]; ZHANG Fan, GUO Yunfei [1]

Affiliations: [1] Institute of Information Technology Research, Information Engineering University, Zhengzhou 450001, China; [2] Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou 450001, China

Source: Journal of Electronics & Information Technology, 2024, No. 12, pp. 4422-4431 (10 pages)

Funding: Major Science and Technology Project of Henan Province (221100211200).

Abstract: Existing network topology deception defense methods make decisions only in the spatial dimension and do not consider how to carry out spatio-temporal, multi-dimensional topology deception in cloud-native network environments. To address this problem, the paper proposes a multi-stage Flipit game network topology deception defense method based on deep reinforcement learning to obfuscate reconnaissance attacks in cloud-native networks. First, the topology deception attack-defense model in cloud-native network environments is analyzed. Then, by introducing a discount factor and transition probabilities, a Flipit-based multi-stage game model for network topology deception defense is constructed. Next, building on an analysis of the attack and defense strategies in the game, a topology deception generation method based on deep reinforcement learning is developed to solve for the topology deception defense strategy of the multi-stage game model. Finally, experiments in a purpose-built environment show that the proposed method can effectively model and analyze topology deception attack-defense scenarios in cloud-native networks, and that the proposed algorithm has clear advantages over the other algorithms compared.
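To make the role of the discount factor and the transition probabilities concrete, the following is a minimal sketch, not the paper's method. It assumes a fixed attacker strategy, so that the defender faces a small Markov decision process, and it swaps in tabular value iteration for the deep reinforcement learning solver (the keywords name deep deterministic policy gradient). The two states, the two defender actions, and every number in `P` and `R` are hypothetical values chosen only for illustration.

```python
from typing import List, Tuple

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000) -> Tuple[List[float], List[int]]:
    """Solve V(s) = max_a [ R[s][a] + gamma * sum_t P[s][a][t] * V(t) ].

    P[s][a][t]: probability of moving from state s to state t under action a.
    R[s][a]:    immediate defender payoff for action a in state s.
    gamma:      discount factor weighting payoffs of later stages.
    """
    n_states, n_actions = len(R), len(R[0])

    def q_value(s: int, a: int, V: List[float]) -> float:
        return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        new_V = [max(q_value(s, a, V) for a in range(n_actions)) for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:  # stop once the values have stabilised
            break
    # Greedy defender action in each state with respect to the final values.
    policy = [max(range(n_actions), key=lambda a: q_value(s, a, V)) for s in range(n_states)]
    return V, policy

# Hypothetical toy instance (all numbers are assumptions, not from the paper).
# State 0: attacker holds an accurate topology view; state 1: attacker is deceived.
# Action 0: keep the current virtual topology; action 1: refresh the deception.
GAMMA = 0.9
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # transitions from state 0 under actions 0 and 1
    [[0.4, 0.6], [0.1, 0.9]],  # transitions from state 1 under actions 0 and 1
]
R = [
    [-1.0, -1.5],  # state 0: exposure loss, plus a 0.5 refresh cost for action 1
    [1.0, 0.5],    # state 1: deception gain, minus the same refresh cost
]

V, policy = value_iteration(P, R, gamma=GAMMA)
print("state values:", V)
print("greedy defender policy:", policy)
```

A tabular solver like this only works when states and actions can be enumerated. A continuous or very large action space, such as choosing both when and how to alter a virtual topology, is the kind of setting where a deep reinforcement learning method becomes the more natural choice.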

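Keywords: cloud-native network; topology deception; multi-stage Flipit game; deep reinforcement learning; deep deterministic policy gradient algorithm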

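Classification: TP393.08 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Computer Science and Technology]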
