PDF Malicious Indicators Extraction Technique Based on Improved Symbolic Execution


Authors: SONG Enzhou, HU Tao [1], YI Peng [1], WANG Wenbo [1]

Affiliation: [1] National Digital Switching System Engineering Technological R&D Center, Zhengzhou 450001, China

Source: Computer Science (《计算机科学》), 2024, No. 7, pp. 389-396 (8 pages)

Funding: General Program of the National Natural Science Foundation of China (62176264).

Abstract: Malicious PDF documents are a common attack vector used by APT groups, and extracting and analyzing indicators from their embedded JavaScript code is an important way to determine whether a document is malicious. Attackers, however, can evade analysis through heavy obfuscation and through virtual machine and sandbox detection. This paper therefore applies symbolic execution to PDF indicator extraction. It proposes a PDF malicious indicator extraction technique based on improved symbolic execution and implements SYMBPDF, an indicator extraction system consisting of three modules: code parsing, symbolic execution, and indicator extraction. The code parsing module extracts and reassembles the embedded JavaScript code. The symbolic execution module uses a code rewriting method that forces branch transfers to raise the code coverage of symbolic execution, together with a concurrency strategy and two constraint-solving optimizations to improve execution efficiency. The indicator extraction module consolidates and records the malicious indicators. In an evaluation on 1,271 malicious samples, the indicator extraction success rate was 92.2% and the indicator effectiveness was 91.7%; compared with the system before optimization, code coverage improved by 8.5% and system performance improved by 32.3%.

Keywords: malicious documents; JavaScript code; indicator extraction; symbolic execution; code rewriting; constraint solving optimization

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
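
The abstract states that forcing branch transfers raises code coverage when samples hide their payload behind sandbox or virtual machine checks. The sketch below illustrates that general idea only; it is not SYMBPDF's implementation, which rewrites real JavaScript and uses a symbolic execution engine. It assumes a toy program representation (the hypothetical `Call` and `If` classes), hypothetical check names, and placeholder indicator values. A concrete run follows whichever branch the environment selects, while the forced run visits both sides of every branch and records every call argument.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Call:
    """A call whose argument would be recorded as an indicator."""
    api: str
    arg: str


@dataclass
class If:
    """A branch guarded by a named environment check (hypothetical names)."""
    check: str
    then: List["Stmt"] = field(default_factory=list)
    orelse: List["Stmt"] = field(default_factory=list)


Stmt = Union[Call, If]
Indicator = Tuple[str, str]


def run_concrete(prog: List[Stmt], env: Dict[str, bool]) -> List[Indicator]:
    """Follow only the branch that the environment selects."""
    out: List[Indicator] = []
    for stmt in prog:
        if isinstance(stmt, Call):
            out.append((stmt.api, stmt.arg))
        else:
            branch = stmt.then if env.get(stmt.check, False) else stmt.orelse
            out.extend(run_concrete(branch, env))
    return out


def run_forced(prog: List[Stmt]) -> List[Indicator]:
    """Visit both sides of every branch, ignoring the check's outcome."""
    out: List[Indicator] = []
    for stmt in prog:
        if isinstance(stmt, Call):
            out.append((stmt.api, stmt.arg))
        else:
            out.extend(run_forced(stmt.then))
            out.extend(run_forced(stmt.orelse))
    return out


# Toy sample: the payload is reachable only when no sandbox is detected.
# The URLs are placeholders, not indicators taken from the paper.
SAMPLE: List[Stmt] = [
    If(
        "is_sandbox",
        then=[Call("app.alert", "Document failed to load")],
        orelse=[
            Call("app.launchURL", "http://example.invalid/payload"),
            If(
                "viewer_is_old",
                then=[Call("app.launchURL", "http://example.invalid/legacy")],
            ),
        ],
    ),
]

if __name__ == "__main__":
    print("concrete:", run_concrete(SAMPLE, {"is_sandbox": True}))
    print("forced:  ", run_forced(SAMPLE))
```

In an analysis environment where `is_sandbox` evaluates to true, the concrete run sees only the decoy alert, whereas the forced run also reaches both `app.launchURL` calls. Visiting both sides unconditionally can reach states that no real execution would produce, which is one reason a full system pairs such rewriting with constraint solving.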

 
