Study on Binary Code Similarity Detection Based on Jump-SBERT


Authors: YAN Yintong; YU Lu; WANG Taiyan; LI Yuwei; PAN Zulie (College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China; Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China)

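Affiliations: [1] College of Electronic Engineering, National University of Defense Technology, Hefei 230037 [2] Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037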

Source: Computer Science (《计算机科学》), 2024, Issue 5, pp. 355-362 (8 pages)

Funding: Young Scientists Fund of the National Natural Science Foundation of China (62202484).

Abstract: Binary code similarity detection plays an important role in many security applications. Existing methods suffer from high computational cost combined with low accuracy, incomplete recognition of the semantic information in binary functions, and evaluation on a single dataset. To address these problems, a binary code similarity detection technique based on Jump-SBERT is proposed. Jump-SBERT has two main innovations. First, it uses a Siamese network to build the SBERT network structure, which reduces the model's computational cost while keeping its accuracy unchanged. Second, it introduces a jump recognition mechanism that lets Jump-SBERT learn the graph structure information of binary functions and thus capture their semantic information more comprehensively. Experimental results show that the recognition accuracy of Jump-SBERT can reach 96.3% in a small function pool (32 functions) and 85.1% in a large function pool (10000 functions), which is 36.13% higher than state-of-the-art (SOTA) methods, and that Jump-SBERT is more stable in large-scale binary code similarity detection. Ablation experiments show that both innovations have a positive effect on Jump-SBERT, with the jump recognition mechanism contributing up to 9.11%.
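The abstract reports accuracy inside a "function pool" and credits a Siamese (twin) encoder with lowering the computational cost. The following is a minimal sketch of how such a pool evaluation might look, under stated assumptions: `toy_embed` is a hypothetical stand-in that counts opcodes and is not the SBERT model from the paper, the sample instructions are invented for illustration, and the jump recognition mechanism is not modeled at all. The point it illustrates is that a twin encoder embeds each function independently, so every pool entry is encoded once and queries are compared to it by cosine similarity.

```python
import math
from collections import Counter


def toy_embed(instructions):
    """Hypothetical stand-in for the shared encoder (NOT the paper's model).

    Maps a function, given as a list of assembly strings, to a bag-of-opcodes vector.
    """
    return Counter(ins.split()[0] for ins in instructions)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pool_accuracy(queries, pool):
    """Fraction of queries whose most similar pool entry is the true match.

    queries: list of (instructions, index of the matching function in pool).
    """
    # Each pool function is encoded once, independently of any query.
    pool_vecs = [toy_embed(func) for func in pool]
    hits = 0
    for instructions, target in queries:
        query_vec = toy_embed(instructions)
        best = max(range(len(pool_vecs)),
                   key=lambda i: cosine(query_vec, pool_vecs[i]))
        hits += int(best == target)
    return hits / len(queries)


# Invented sample data: a pool of three functions and two query variants.
pool = [
    ["push rbp", "mov rbp, rsp", "add eax, 1", "ret"],
    ["xor eax, eax", "cmp edi, 0", "jle 0x10", "ret"],
    ["call 0x400", "test eax, eax", "jne 0x20", "ret"],
]
queries = [
    (["push rbp", "mov rbp, rsp", "add eax, 2", "pop rbp", "ret"], 0),
    (["xor eax, eax", "cmp esi, 0", "jle 0x18", "ret"], 1),
]

accuracy = pool_accuracy(queries, pool)
```

With this layout, the number of encoder passes grows with the number of queries plus the pool size, whereas a model that must process each query-candidate pair jointly needs one pass per pair. This is the usual motivation for Siamese encoders; the abstract itself does not state the exact evaluation protocol.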

Keywords: binary code; similarity detection; semantic information; SBERT network structure; jump recognition mechanism

Classification code: TP312 [Automation and Computer Technology: Computer Software and Theory]

 
