LLM-based Fuzzing for Network Protocols


Authors: YAN Chen [1]; ZHANG Yi [2]; GONG Hanwen [3]; XUE Yinxing; GUO Yan

Affiliations: [1] Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou 215123, Jiangsu, China [2] School of Software Engineering, University of Science and Technology of China, Hefei 230026, China [3] School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2025, No. 2, pp. 403-409 (7 pages)

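Funding: Supported by the Anhui Provincial Major Science and Technology Project (202103a05020009) and the National Key Research and Development Program of China (2023YFF0612300).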

Abstract: Protocol security is the foundation of Internet application security, and fuzzing is an important means of verifying it. The difficulty of protocol testing is that packets must be generated strictly according to the structure and order specified in the protocol's RFC. To generate packets that meet the protocol's requirements, existing methods usually use a set of recorded message sequences as seeds. However, recorded sequences are often too few and too uniform to cover enough protocol states, and data produced by randomly mutating the seeds is very likely to become invalid again. To address this problem, this paper explores network protocol fuzzing based on large language models (LLMs). LLMs are trained on large amounts of protocol-related text, including RFC documents, and thereby acquire the ability to understand protocols and generate the test cases needed for testing. Building on AFLNET, this paper proposes LLMAFL, a method that uses an LLM for network protocol fuzzing in three respects: state acquisition, state-based seed generation, and directed strategy mutation. To evaluate LLMAFL, we test multiple protocols in ProfuzzBench and compare it with the current leading tools AFLNET and CHATAFL in terms of code coverage and state coverage. The results show that, within the same testing time, LLMAFL achieves higher code coverage and state coverage than AFLNET, and on some protocols it also exceeds CHATAFL by a large margin.
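The abstract's point that randomly mutated seeds tend to become invalid can be illustrated with a small sketch. This is not LLMAFL or AFLNET code: the FTP-like state machine, the seed sequence, and every function name below are hypothetical and chosen only for illustration. The sketch compares a structure-blind byte mutation with a mutation that keeps the command keyword and changes only the argument, and counts how many protocol states each mutated sequence reaches.

```python
import random

# Hypothetical toy protocol: a tiny FTP-like state machine (an assumption, not from the paper).
TRANSITIONS = {
    ("INIT", "USER"): "NEED_PASS",
    ("NEED_PASS", "PASS"): "LOGGED_IN",
    ("LOGGED_IN", "PWD"): "LOGGED_IN",
    ("LOGGED_IN", "LIST"): "LOGGED_IN",
    ("LOGGED_IN", "QUIT"): "CLOSED",
}

# Hypothetical recorded message sequence used as the seed.
SEED = ["USER anonymous", "PASS guest", "PWD", "LIST /", "QUIT"]


def run_sequence(messages):
    """Feed messages to the toy server and return the set of states visited."""
    state = "INIT"
    visited = {state}
    for msg in messages:
        command = msg.split(" ", 1)[0]
        next_state = TRANSITIONS.get((state, command))
        if next_state is None:
            break  # toy model: stop at the first message the server rejects
        state = next_state
        visited.add(state)
    return visited


def byte_mutation(messages, rng):
    """Structure-blind mutation: overwrite one random byte of one message."""
    mutated = list(messages)
    i = rng.randrange(len(mutated))
    data = bytearray(mutated[i].encode("ascii"))
    data[rng.randrange(len(data))] = rng.randrange(33, 127)  # printable, non-space
    mutated[i] = data.decode("ascii")
    return mutated


def field_mutation(messages, rng):
    """Structure-aware mutation: keep the command keyword, replace only the argument."""
    mutated = list(messages)
    i = rng.randrange(len(mutated))
    command = mutated[i].split(" ", 1)[0]
    length = rng.randrange(1, 16)
    argument = "".join(rng.choice("abcxyz/.") for _ in range(length))
    mutated[i] = f"{command} {argument}"
    return mutated


def average_states(mutator, trials=1000, seed=0):
    """Average number of states reached by mutated copies of SEED."""
    rng = random.Random(seed)
    total = sum(len(run_sequence(mutator(SEED, rng))) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    print("unmutated seed :", len(run_sequence(SEED)))
    print("byte mutation  :", average_states(byte_mutation))
    print("field mutation :", average_states(field_mutation))
```

In this toy model the argument-only mutation always reaches the same states as the unmutated seed, while the byte mutation sometimes corrupts a command keyword and cuts the session short. A real protocol fuzzer faces the same trade-off with far richer message grammars, which is the gap the abstract proposes to address with LLM-derived protocol knowledge.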

Keywords: network protocol fuzzing; large language model; seed generation; directed strategy mutation

Classification number: TP309 [Automation and Computer Technology - Computer System Architecture]

 
