基于GPU加速的快速全同态加密算法设计与实现

Design and implementation of fast fully homomorphic encryption algorithm based on GPU acceleration

作　　者：谭泽玖赵鑫万俊平刘虎成蒋琳徐金明纪守领王轩 TAN Zejiu;ZHAO Xin;WAN Junping;LIU Hucheng;JIANG Lin;XU jinming;JI Shouling;WANG Xuan(School of computer science and technology,Harbin Institute of Technology,Shenzhen,Shenzhen,518055,China;College of control science and engineering,Zhejiang University,Hangzhou,310058,China;College of Computer Science and Technology,Zhejiang University,Hangzhou,310058,China;Pengcheng National Laboratory,Shenzhen 518000,China;Guangdong Key Laboratory of New Security and Intelligence Technology,Shenzhen 518005,China)

机构地区：[1]哈尔滨工业大学(深圳)计算机科学与技术学院,深圳518055 [2]浙江大学控制科学与工程学院,杭州310058 [3]浙江大学计算机科学与技术学院,杭州310058 [4]鹏城国家实验室,深圳518000 [5]广东省安全智能新技术重点实验室,深圳518055

出　　处：《网络空间安全科学学报》2024年第3期41-52,共12页Journal of Cybersecurity

基　　金：国家重点研发计划项目(2022YFB3102100)。

摘　　要：全同态加密算法支持直接对加密数据(密文)执行代数运算,但其密文评估中的数论变换(NTT)涉及大量高维度整系数多项式环运算,限制了其在隐私计算中的应用。针对CPU实现方案对NTT算法计算并行度较低的问题,提出一种CPU+GPU异构的CKKS全同态加密实现方案。首先,根据NTT算法数据内存访问规律,设计一种数据暂存共享内存策略,有效减少频繁的全局内存访问。其次,针对数据规模可变导致内核出现部分空闲线程的问题,设计线程工作负载动态分配机制,并采用不同基数的蝴蝶变换结构,提高数据输入的灵活性并优化并行策略。再次,提出单—多内核混合调用模式,通过NTT算法蝶形变换分组大小动态切换内核调用模式,充分利用GPU多核调用的并行潜力。最后,设计并实现并行程度更高、计算复杂度更低的NTT算法,利用该算法实现并行的同态乘法运算,并基于HElib库实现CPU+GPU异构的CKKS全同态加密算法。实验结果表明,与使用AVX-512加速的HElib库相比,所提的NTT/INTT计算时间减短近65%。Fully homomorphic encryption supports direct algebraic operations on encrypted data(ciphertext),with the foundation of its ciphertext evaluation phase involving numerous high-dimensional integer coefficient polynomial ring additions and multiplications.This limits its widespread application in the field of privacy computing.The CPU implementation scheme offers low parallelism for the Number Theoretic Transform(NTT)algorithm calculations.A CPU+GPU heterogeneous fully homomorphic encryption im-plementation scheme was proposed.Firstly,a cache strategy of data temporarily stored in shared memory was introduced,which stored repeatedly read and unchanging data,including NTT input data and rotation factors,in shared memory to reduce frequent glob-al memory access.Secondly,to address the issue of partially idle threads caused by variable data sizes,it dynamically allocates thread workloads based on data size and hardware resources,adopting butterfly transformation structures of different radices to achieve opti-mal parallel strategies while enhancing the flexibility of data input.Thirdly,it introduces a single-multi-core mixed invocation mode,dynamically switching the kernel invocation mode based on the group size of butterfly transformations in each NTT iteration,to fully utilize the parallel potential of multi-core invocations on GPU.Finally,it designs and implements a higher parallelism,lower computa-tional complexity NTT algorithm for GPU,uses this algorithm to perform parallel homomorphic multiplication operations,and imple-ments a CPU+GPU heterogeneous CKKS fully homomorphic encryption algorithm based on the HElib library.Experimental results show NTT/INTT computation time is reduced by nearly 65%compared to HElib library using AVX-512 acceleration technology.

关键词：隐私计算全同态加密 GPU并行加速数论变换缓存策略卷积神经网络

分类号：TN918.4[电子电信—通信与信息系统]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

基于GPU加速的快速全同态加密算法设计与实现

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

基于GPU加速的快速全同态加密算法设计与实现

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索