Authors: Huang Wanwei, Yuan Bo, Wang Sunan, Ding Yi, Li Yuhua
Affiliations: [1] College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China; [2] The Third Construction Co., Ltd of China CREC Railway Electrification Engineering Group, Zhengzhou 450052, China; [3] Electronic and Communication Engineering, Shenzhen Polytechnic University, Shenzhen 518055, China
Source: China Communications (中国通信(英文版)), 2024, Issue 9, pp. 262-275 (14 pages)
Funding: Supported by the Major Science and Technology Programs in Henan Province (No. 241100210100); the Project of Science and Technology in Henan Province (No. 242102211068, No. 232102210078); the Key Field Special Project of Guangdong Province (No. 2021ZDZX1098); the China University Research Innovation Fund (No. 2021FNB3001, No. 2022IT020); and the Shenzhen Science and Technology Innovation Commission Stable Support Plan (No. 20231128083944001).
Abstract: Existing research on cyber attack-defense analysis has typically adopted stochastic game theory to model the problem, but such models assume complete rationality and ignore the information opacity of practical attack and defense scenarios, so the resulting models and methods lack accuracy. To address this problem, we investigate network defense policy methods under finite rationality constraints and propose a network defense policy selection algorithm based on deep reinforcement learning. Using graph-theoretical methods, we transform the decision-making problem into a path optimization problem and use a compression method based on service nodes to map the network state. On this basis, we improve the A3C algorithm and design the Defense-A3C defense policy selection algorithm with online learning capability. The experimental results show that the proposed model and method converge stably to a better network state after training, and do so faster and more stably than the original A3C algorithm. Comparison with existing typical approaches verifies the advancement of Defense-A3C.
Keywords: A3C; cyber attack-defense analysis; deep reinforcement learning; stochastic game theory
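Classification codes: TP393.09 [Automation and Computer Technology: Computer Application Technology]; O225 [Automation and Computer Technology: Computer Science and Technology]; TP18 [Science: Operations Research and Cybernetics]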