Semantic Inference Method of Binary Protocols Based on Field Semantic Inference Model


Authors: DONG Shuqi; HUANG Jixian; NIAN Zhenhong; JING Jing [1] (Information Engineering University, Zhengzhou 450001, China)

Affiliation: [1] Information Engineering University, Zhengzhou 450001, Henan, China

Source: Journal of Information Engineering University, 2025, No. 2, pp. 238-244 (7 pages)

Abstract: To address the low accuracy and weak generalization of field semantic inference in binary protocol reverse engineering, this paper proposes an automatic inference method built on a field semantic inference model based on softmax classification (FSISC). First, the collected protocol data are grouped into sessions by IP address and port number. Second, three gated recurrent units (GRUs) extract three kinds of features from known and unknown protocols: the field itself, the field column context, and the multi-sequence row context. Third, the semantic descriptions of known protocol fields are converted into embedding vectors, the cosine similarity between these vectors is computed, and the k-means++ algorithm clusters the fields by the semantic similarity of their descriptions. Finally, a softmax classification model maps the extracted features to the clustered semantic categories, enabling automated semantic inference for unknown protocols. Experimental results show that the proposed method effectively improves generalization to unknown protocols and achieves semantic inference for four protocols. Compared with the automated field semantics inference method for binary protocol reverse engineering (FSIBP), the average accuracy of semantic inference is improved.
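Three of the building blocks named in the abstract (session grouping by IP address and port, cosine similarity between embedding vectors, and softmax classification) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation: the function names, the packet tuple layout, and the choice to treat both directions of a flow as one session are all assumptions made here, and the GRU feature extractors and k-means++ clustering are omitted.

```python
import math
from collections import defaultdict


def session_key(src_ip, src_port, dst_ip, dst_port):
    """Build a direction-independent session key (an assumption of this sketch)."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def group_sessions(packets):
    """Group (src_ip, src_port, dst_ip, dst_port, payload) tuples into sessions."""
    sessions = defaultdict(list)
    for src_ip, src_port, dst_ip, dst_port, payload in packets:
        key = session_key(src_ip, src_port, dst_ip, dst_port)
        sessions[key].append(payload)
    return dict(sessions)


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax: turn raw class scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a pipeline of this shape, `group_sessions` would run once over the captured traffic, `cosine_similarity` would compare embeddings of field descriptions before clustering, and `softmax` would convert the classifier's per-category scores into a probability for each semantic category.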

Keywords: binary protocol reverse engineering; deep learning; softmax classification model; semantic inference; gated recurrent unit

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]

 
