A Fast Software Implementation of SM4 Based on Composite Fields  Cited by: 5


Authors: CHEN Chen[1,2], GUO Hua, WANG Chuang[3], LIU Yuan-Hao, LIU Jian-Wei[1]

Affiliations: [1] Key Laboratory of Aerospace Network Security (Ministry of Industry and Information Technology), Beihang University, Beijing 100191, China; [2] State Key Laboratory of Cryptology, Beijing 100878, China; [3] School of Computer Science, National University of Defense Technology, Changsha 410073, China

Source: Journal of Cryptologic Research (《密码学报》), 2023, No. 2, pp. 289-305 (17 pages)

Funding: Beijing Natural Science Foundation (4202037); National Natural Science Foundation of China (61972018).

Abstract: Since SM4 became an ISO/IEC international standard, its performance has attracted increasing attention. Current methods for improving the efficiency of SM4 implementations focus mainly on shortening the S-box computation time. Most composite-field approaches reuse composite fields designed for AES, and few composite fields over GF((2^4)^2) have been proposed specifically for software implementations of SM4. This paper presents, for the first time, a composite field over GF((2^4)^2) for the software implementation of the SM4 S-box, and gives a fast software implementation of SM4 based on it. The mathematical construction of the S-box composite field is optimized by exhaustive search and mathematical analysis, and an isomorphic mapping matrix and its minimization objective function are constructed. The S-box operation is completed with only 175 gate functions, an average of 22 gate functions per output bit. Based on the bit-slicing technique, the AVX2 extended instruction set is used to encrypt 256 groups of messages in parallel. Encryption and decryption take only 6.5 clock cycles per byte on average. The method has a low dependence on hardware: tests on Intel i5, Intel i7, and AMD R7 platforms all show a significant improvement in the computational efficiency of SM4. The method is a useful reference for fast software implementations of other cryptographic algorithms with similar S-box structures.
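As a rough illustration of the composite-field idea named in the abstract, the Python sketch below inverts a byte in GF((2^4)^2) using only GF(2^4) arithmetic and a single GF(2^4) inversion. It is not the paper's construction: the GF(2^4) polynomial x^4 + x + 1, the extension polynomial y^2 + y + λ, the searched value of λ, and all function names are assumptions chosen for this example. The isomorphic mapping matrix and the affine parts of the SM4 S-box are omitted.

```python
# Illustrative sketch only. The field polynomials are assumptions made for
# this example, not the parameters selected in the paper.

GF16_POLY = 0b10011  # x^4 + x + 1, an assumed irreducible polynomial for GF(2^4)


def gf16_mul(a: int, b: int) -> int:
    """Multiply two GF(2^4) elements (4-bit ints) modulo GF16_POLY."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:          # degree reached 4: reduce
            a ^= GF16_POLY
    return result


def gf16_inv(a: int) -> int:
    """Invert in GF(2^4) as a^14 (a^15 == 1 for nonzero a); maps 0 to 0."""
    a2 = gf16_mul(a, a)
    a4 = gf16_mul(a2, a2)
    a8 = gf16_mul(a4, a4)
    return gf16_mul(gf16_mul(a8, a4), a2)


def find_lambda() -> int:
    """Return the smallest lam for which y^2 + y + lam has no root in GF(2^4)."""
    for lam in range(1, 16):
        if all(gf16_mul(y, y) ^ y ^ lam for y in range(16)):
            return lam
    raise ValueError("no irreducible polynomial found")


LAMBDA = find_lambda()  # hypothetical choice; the paper optimizes its own field


def comp_mul(a: int, b: int) -> int:
    """Multiply bytes viewed as a_h*y + a_l, using y^2 = y + LAMBDA."""
    ah, al = a >> 4, a & 0xF
    bh, bl = b >> 4, b & 0xF
    hh = gf16_mul(ah, bh)
    high = hh ^ gf16_mul(ah, bl) ^ gf16_mul(al, bh)
    low = gf16_mul(hh, LAMBDA) ^ gf16_mul(al, bl)
    return (high << 4) | low


def comp_inv(a: int) -> int:
    """Invert in GF((2^4)^2) with one GF(2^4) inversion; maps 0 to 0."""
    ah, al = a >> 4, a & 0xF
    # Norm of a: a_h^2 * LAMBDA + a_h * a_l + a_l^2, an element of GF(2^4).
    norm = gf16_mul(gf16_mul(ah, ah), LAMBDA) ^ gf16_mul(ah, al) ^ gf16_mul(al, al)
    d = gf16_inv(norm)
    # Inverse is conj(a) * d, where conj(a) = a_h*y + (a_h + a_l).
    return (gf16_mul(ah, d) << 4) | gf16_mul(ah ^ al, d)
```

Checking that `comp_mul(a, comp_inv(a)) == 1` for every nonzero byte `a` confirms the inversion formula for the parameters assumed here. In a bit-sliced implementation, each multiplication and XOR above would be expanded into Boolean gates, which is the kind of cost the abstract's gate count measures.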

Keywords: SM4 algorithm; S-box; composite field; bit slicing; AVX2 extended instruction set

Classification number: TP309.7 [Automation and Computer Technology: Computer System Architecture]

 
