Automated Windows domain penetration method based on reinforcement learning

Cited by: 2

Authors: ZHAN Lige; SHA Letian [1,2]; XIAO Fu [1,2]; DONG Jiankuo; ZHANG Pinchang

Affiliations: [1] College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China; [2] Jiangsu Provincial Key Laboratory of Wireless Sensor Network High Technology Research, Nanjing 210023, Jiangsu, China

Source: Chinese Journal of Network and Information Security, 2023, No. 4, pp. 104-120 (17 pages)

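Funding: National Key R&D Program of China (2018YFB0803400); National Science Fund for Distinguished Young Scholars (62125203); National Natural Science Foundation of China (62072253).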

Abstract: A Windows domain provides unified system services for resource sharing and information exchange among users. While this simplifies intranet management, it also introduces significant security risks. In recent years, attacks targeting domain controllers have become increasingly common. Automated penetration testing can flexibly detect vulnerabilities in a Windows domain and help keep office networks running securely and stably; its core task is the efficient discovery of feasible attack paths within the environment. To this end, the penetration testing process is modeled as a reinforcement learning problem, in which an agent interacts with the real domain environment to discover vulnerability combinations and then verify effective attack sequences. Based on the differing contributions of hosts to the penetration process, unnecessary states and actions are removed from the reinforcement learning model, which optimizes the path selection strategy and improves actual attack efficiency. A Q-learning algorithm with state-action pruning and an optimized exploration strategy is used to select the optimal attack path and automatically verify all possible security risks in the domain environment, giving domain administrators a basis for defense. Experiments on typical intranet business scenarios select the optimal path from the 13 efficient attack paths generated. Compared with related work, the proposed method performs better in domain controller privilege acquisition, host privilege acquisition, attack steps, convergence, and time cost.
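The combination the abstract describes (prune the state-action space, then run Q-learning with a tuned exploration strategy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy graph, the host names, the reward values, the reachability-based pruning rule, and the linearly decaying epsilon-greedy schedule are hypothetical stand-ins, since the abstract does not give the paper's actual state encoding, pruning criterion, or exploration policy.

```python
import random

# Hypothetical toy graph: state -> {action: next_state}. Names are placeholders,
# not hosts or actions from the paper.
GRAPH = {
    "foothold": {"to_workstation": "workstation", "to_printer": "printer"},
    "workstation": {"to_fileserver": "fileserver", "back": "foothold"},
    "printer": {},  # dead end: contributes nothing toward the goal
    "fileserver": {"to_dc": "domain_controller", "back": "workstation"},
    "domain_controller": {},
}
GOAL = "domain_controller"


def prune(graph, goal):
    """Assumed pruning rule: keep only states from which the goal is reachable,
    and drop every action that leads into a removed state."""
    useful = {goal}
    changed = True
    while changed:
        changed = False
        for state, actions in graph.items():
            if state not in useful and any(n in useful for n in actions.values()):
                useful.add(state)
                changed = True
    return {s: {a: n for a, n in acts.items() if n in useful}
            for s, acts in graph.items() if s in useful}


def q_learning(graph, start, goal, episodes=500, alpha=0.5, gamma=0.9,
               eps_start=1.0, eps_end=0.05, seed=0):
    """Tabular Q-learning with epsilon decaying linearly across episodes."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in graph.items()}
    for ep in range(episodes):
        eps = eps_start + (eps_end - eps_start) * ep / max(1, episodes - 1)
        state = start
        for _ in range(50):  # step cap per episode
            if state == goal or not q[state]:
                break
            actions = list(q[state])
            if rng.random() < eps:
                action = rng.choice(actions)             # explore
            else:
                action = max(actions, key=q[state].get)  # exploit
            nxt = graph[state][action]
            reward = 10.0 if nxt == goal else -1.0       # assumed reward shape
            best_next = max(q[nxt].values(), default=0.0)
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = nxt
    return q


def greedy_path(q, graph, start, goal, max_steps=20):
    """Follow the highest-valued action from each state until the goal."""
    path, state = [start], start
    while state != goal and q[state] and len(path) <= max_steps:
        action = max(q[state], key=q[state].get)
        state = graph[state][action]
        path.append(state)
    return path
```

Calling `prune(GRAPH, GOAL)` first and then passing the result to `q_learning` and `greedy_path` shows the intended order of operations: the dead-end state is removed before learning, so the agent never spends exploration steps on it. The per-step penalty is what makes shorter paths score higher, which loosely mirrors the abstract's emphasis on attack step count.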

Keywords: Windows domain; penetration testing; reinforcement learning; attack path

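Classification: TP393.08 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Computer Science and Technology]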
