Automatic Intrusion Response Decision-making Method Based on Q-Learning (基于Q-Learning的自动入侵响应决策方法)  Cited by: 4


Authors: LIU Jing; ZHANG Yuchen; ZHANG Hongqi (Department of Cryptogram Engineering, Information Engineering University of PLA, Zhengzhou 450001, China)

Affiliation: [1] School of Cryptography Engineering, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China

Source: Netinfo Security (《信息网络安全》), 2021, No. 6, pp. 26-35 (10 pages)

Funding: National Key Research and Development Program of China [2016YFF0204002, 2016YFF0204003]; National Natural Science Foundation of China [61902427, 61471344].

Abstract: To address the poor adaptability of existing automatic intrusion response decision-making, this paper proposes Q-AIRD, an automatic intrusion response decision-making method based on Q-Learning. Q-AIRD formalizes the states and actions of network attack and defense on the basis of an attack graph, and introduces an attack mode layer to identify attackers of different capabilities so that response actions can be targeted accordingly. To suit the characteristics of intrusion response, response strategies are selected with the Softmax algorithm together with a security threshold θ, a stable reward factor μ, and a penalty factor ν. A voting mechanism evaluates each strategy against multiple response purposes, meeting the need for multi-purpose response. On this basis, an automatic intrusion response decision algorithm based on Q-Learning is designed. Simulation experiments show that Q-AIRD adapts well and can make timely and effective intrusion response decisions.
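For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal, generic sketch of Softmax (Boltzmann) action selection combined with the standard tabular Q-Learning update. It is not the Q-AIRD algorithm itself: the abstract does not define how the security threshold θ, the stable reward factor μ, the penalty factor ν, the attack mode layer, or the voting mechanism enter the computation, so none of them are modeled here. The function names, the `temperature` parameter, and the `alpha`/`gamma` defaults are illustrative assumptions.

```python
import math
import random


def softmax_probs(q_values, temperature=1.0):
    """Return a Boltzmann (Softmax) distribution over actions given their Q-values.

    `temperature` is an assumed exploration parameter: higher values flatten
    the distribution, lower values make selection closer to greedy.
    """
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index with probability proportional to exp(Q / temperature)."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-Learning update in place and return the new Q(s, a).

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

In an intrusion response setting, the states would correspond to attack-graph states and the actions to candidate response measures; how rewards are assigned is specific to the paper and is left open in this sketch.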

Keywords: reinforcement learning; automatic intrusion response; Softmax algorithm; multi-objective decision-making

Classification: TP309 [Automation and Computer Technology - Computer System Architecture]

 
