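Affiliations: [1] School of Computer Science, Fudan University, Shanghai 200433 [2] State Key Laboratory of Cryptology, Beijing 100036
Source: Chinese Journal of Computers (《计算机学报》), 2024, No. 3, pp. 589-607 (19 pages)
Funding: Supported by the National Key Research and Development Program of China (2022YFB2701600); the General Program of the State Key Laboratory of Cryptology (MMKFKT202227); the Technical Standards Fund of the Shanghai Science and Technology Commission (21DZ2200500); the Shanghai Collaborative Innovation Fund (XTCX-KJ-2023-54); and the Special Fund for Key Blockchain Technologies of the Shanghai Science and Technology Commission (23511100300).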
Abstract: The rapid development of quantum computing poses a serious threat to existing public-key cryptosystems, and post-quantum cryptography, which is designed to resist quantum attacks, has become a research focus in cryptography. The security of the Internet of Things (IoT) is also drawing wide attention. ARM Cortex-M4, a low-power embedded processor, is widely used in IoT devices, and deploying post-quantum algorithms on it would give those devices more reliable security guarantees. CNTR and CTRU are NTRU lattice-based key encapsulation schemes proposed by Chinese scholars. Compared with lattice-based key encapsulation schemes that follow the LWE route, they offer combined advantages in security and other performance measures, and they have been approved as a project by China's Cryptography Standardization Technical Committee. This work presents the first efficient and compact implementation of CNTR and CTRU on the ARM Cortex-M4 platform. It makes full use of Single Instruction Multiple Data (SIMD) instructions, adjusts the computation structure and instruction scheduling, and optimizes the core polynomial operations, improving both speed and stack usage. The main contributions are as follows: (1) the time-consuming polynomial centered binomial distribution sampling module is implemented on ARM Cortex-M4 for the first time, making sampling 32.49% faster; (2) a mixed-radix Number Theoretic Transform (NTT) accelerates multiplication of non-NTT-friendly polynomials, and the NTT implementation makes full use of the Floating-Point Unit (FPU) registers and applies layer merging to minimize costly load and store instructions, making the forward and inverse NTT 84.24% and 81.15% faster, respectively; (3) an analysis of coefficient ranges during the NTT enables lazy reduction, which lowers the number of reductions, and improved Barrett and Montgomery reductions lower the number of assembly instructions per reduction; (4) loop unrolling is applied to the time-consuming polynomial inversion, giving a 68.85% speed improvement; (5) for polynomial-ring multiplication with a non-NTT-friendly prime modulus during decryption, a multi-moduli NTT combined with the Chinese Remainder Theorem (CRT) speeds up decryption by 96.26%; (6) memory reuse reduces the stack usage of CNTR and CTRU by 29.86% and 28.17%, respectively. Experimental results show that the proposed optimizations substantially improve implementation efficiency. Compared with the C reference implementation, CNTR and CTRU overall …

English abstract: The rapid advancement of quantum computing technology poses a formidable threat to the foundations of existing public-key cryptography. As researchers work to safeguard digital infrastructure against quantum attacks, Post-Quantum Cryptography (PQC) has emerged as a dynamic and thriving field of study. By exploring new cryptographic techniques and mathematical constructs, PQC researchers aim to ensure the long-term security and resilience of digital communication systems. At present, the security of the Internet of Things (IoT) has attracted much attention. ARM Cortex-M4, a low-power embedded processor, is widely used in IoT devices, so deploying post-quantum cryptography algorithms on it will provide more reliable security guarantees for those devices. CNTR and CTRU are NTRU lattice-based Key Encapsulation Mechanisms (KEMs) proposed by Chinese scholars. These schemes offer comprehensive advantages in security and performance compared with lattice-based KEMs that follow the Learning With Errors (LWE) technical route. They have been approved as a project by the Cryptography Standardization Technical Committee (CSTC) in China. The primary focus of this paper is an efficient and compact implementation of the CNTR and CTRU schemes on the ARM Cortex-M4 platform. By using Single Instruction Multiple Data (SIMD) instructions, adjusting the operation structure, and optimizing the instruction arrangement, we significantly increase the speed and reduce the stack usage of these algorithms. The main contributions of this paper include the following. First, this paper implements the time-consuming polynomial Centered Binomial Distribution (CBD) sampling module on ARM Cortex-M4 for the first time, accelerating sampling by 32.49%. Additionally, mixed-radix Number Theoretic …
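Keywords: post-quantum cryptography; key encapsulation mechanism; number theoretic transform; polynomial arithmetic; ARM Cortex-M4 implementation
Classification: TP309 [Automation and Computer Technology: Computer System Architecture]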
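
The abstract names Barrett and Montgomery reduction as the modular-reduction building blocks of the NTT, but this record gives neither the paper's improved variants nor its modulus. As a rough illustration of the two textbook techniques only, the following C sketch uses a hypothetical modulus Q = 3329 with 16-bit coefficients. The names `Q`, `QINV`, `montgomery_reduce`, and `barrett_reduce` are assumptions made for this example and are not taken from the paper.

```c
#include <stdint.h>

/* Hypothetical parameters for illustration only (not from the paper). */
#define Q    3329      /* example odd prime modulus */
#define QINV (-3327)   /* Q^{-1} mod 2^16, written as a signed 16-bit value */

/* Montgomery reduction (textbook form).
 * For |a| < Q * 2^15, returns r with r * 2^16 = a (mod Q) and -Q < r < Q.
 * Assumes narrowing to int16_t wraps and >> on negative values is an
 * arithmetic shift, as on common compilers targeting ARM. */
static int16_t montgomery_reduce(int32_t a)
{
    /* t = a * Q^{-1} mod 2^16, so a - t*Q is divisible by 2^16 */
    int16_t t = (int16_t)((int16_t)a * QINV);
    return (int16_t)((a - (int32_t)t * Q) >> 16);
}

/* Barrett reduction (textbook form, centered variant).
 * Returns a small centered representative congruent to a modulo Q. */
static int16_t barrett_reduce(int16_t a)
{
    const int32_t v = ((1 << 26) + Q / 2) / Q;         /* round(2^26 / Q) */
    int16_t t = (int16_t)((v * a + (1 << 25)) >> 26);  /* round(a / Q)    */
    return (int16_t)(a - t * Q);
}
```

In an NTT butterfly, a product of two 16-bit coefficients would typically go through `montgomery_reduce`, while `barrett_reduce` would be applied only when a coefficient is about to exceed 16 bits. The lazy reduction described in the abstract concerns how rarely that second step can be performed.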