Identifier obfuscation method based on Low Level Virtual Machine  Cited by: 1


Authors: TIAN Dajiang; LI Chengyang; HUANG Tianbo; WEN Weiping [1] (School of Software and Microelectronics, Peking University, Beijing 102600, China)

Affiliation: [1] School of Software and Microelectronics, Peking University, Beijing 102600, China

Source: Journal of Computer Applications, 2022, No. 8, pp. 2540-2547 (8 pages)

Funding: Huawei-Peking University University-Enterprise Cooperation Project (2020001763).

Abstract: Most existing code obfuscation solutions are tied to a specific programming language or platform and therefore lack generality. Moreover, control flow obfuscation and data obfuscation introduce additional overhead. To address these problems, an identifier obfuscation method based on Low Level Virtual Machine (LLVM) is proposed. The method implements four identifier obfuscation algorithms: a random identifier algorithm, an overload induction algorithm, an abnormal identifier algorithm, and a high-frequency word replacement algorithm. A new hybrid obfuscation algorithm is also designed by combining these algorithms. The method works in three steps. First, function names that meet the obfuscation criteria are selected from the intermediate files produced by the compiler front end. Second, these function names are processed by a specific obfuscation algorithm. Finally, a specific compiler back end converts the obfuscated files into binary files. The method applies to the languages supported by LLVM and does not affect the normal functionality of the program. Across different programming languages, the time overhead is within 20% and the space overhead hardly increases. The average obfuscation ratio of the programs is 77.5%, and, compared with a single replacement algorithm or overload algorithm, theoretical analysis indicates that the proposed hybrid identifier algorithm can provide stronger concealment. Experimental results show that the proposed method has low performance overhead, strong concealment, and wide applicability.

Keywords: software protection; code obfuscation; identifier obfuscation; Low Level Virtual Machine (LLVM); obfuscation method

CLC number: TP312 [Automation and Computer Technology: Computer Software and Theory]

 
