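Affiliations: [1] State Key Laboratory of Public Big Data, Guiyang 550025; [2] College of Computer Science and Technology, Guizhou University, Guiyang 550025; [3] Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714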
Source: Computer Science (《计算机科学》), 2023, No. 8, pp. 321-332 (12 pages)
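Funding: Guizhou Province Science and Technology Program Project ([2020]4Y056); Key Research and Development Program of the Ministry of Science and Technology (2020YFA0712303); Chongqing Science and Technology Projects (cstc2021jcyj-msxmX0821, cstc2020yszx-jcyjX0005, cstc2021yszx-jcyjX0004, 2022YSZX-JCX0011CSTB, 2021000263).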
Abstract: With the development of big data and cloud computing, demand for cloud computing services is growing rapidly. When users request such services, their private data must be stored and processed on the cloud platform, which creates a risk of data leakage. Homomorphic encryption allows computation directly on ciphertexts without decryption, and decrypting the resulting ciphertext yields the same result as computing on the plaintexts, so it can be used to protect users' private data. This paper considers the following two-party framework in the semi-honest model: the client encrypts its private data in a specified way and sends the ciphertext to the server; the server, relying on the property that the homomorphic encryption scheme supports operations between plaintexts and ciphertexts, uses a trained plaintext model to classify the encrypted data; finally, the server returns the encrypted classification result, which the client decrypts itself. Within this framework, three machine learning classifiers, namely the hyperplane classifier, the decision tree, and k-nearest neighbors (KNN), are designed on top of the Brakerski-Gentry-Vaikuntanathan (BGV) homomorphic encryption scheme. According to the characteristics of each classifier, different ciphertext packing strategies and classification procedures are designed with single-instruction-multiple-data (SIMD) techniques, which significantly reduces the communication overhead between client and server. In particular, in the prediction phase the hyperplane and decision tree classifiers require no interaction, and the KNN classifier requires only one interaction. All three classifiers are implemented in C++ with the HElib homomorphic encryption library. On UCI public datasets, the hyperplane classifier classifies a single sample within tens to hundreds of milliseconds, and the decision tree takes at most tens of milliseconds. Both exceed 90% prediction accuracy on encrypted data, and the only communication cost is the encrypted private data sent from the client to the server and the encrypted classification label sent back. The KNN classifier takes about 4 s on average to classify a single sample, with prediction accuracy above 90% on encrypted data; beyond the private data and the classification label, the two parties only need to bear one additional round of intermediate results exchanged between server and client. Compared with similar protocols based on homomorphic encryption, in terms of communication rounds, prediction accuracy, running efficiency, and other aspects …
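The two-party flow described in the abstract (client encrypts, server scores with a plaintext model, client decrypts) can be illustrated for the hyperplane classifier with a minimal sketch. It rests on several assumptions that are not taken from the paper: it substitutes a toy Paillier scheme for BGV/HElib so that it stays self-contained, it uses Python rather than C++, it does no SIMD packing, it assumes features and weights are already scaled to integers, and the client takes the sign of the decrypted score instead of receiving an encrypted label. All function names are hypothetical.

```python
import math
import random

# Toy Paillier parameters built from two known Mersenne primes.
# Assumption for illustration only: far too small for real security.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAM, -1, N)  # valid because the generator is g = N + 1


def encrypt(m):
    """Encrypt an integer with the public key N (negatives wrap modulo N)."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) % N2 * pow(r, N, N2) % N2


def decrypt(c):
    """Client side: decrypt and map the result back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def encrypted_score(weights, bias, enc_features):
    """Server side: compute Enc(w . x + b) from a plaintext model and encrypted x."""
    acc = encrypt(bias)
    for w, c in zip(weights, enc_features):
        # Raising a ciphertext to w multiplies the hidden value by w;
        # multiplying ciphertexts adds the hidden values.
        acc = acc * pow(c, w % N, N2) % N2
    return acc


def classify(weights, bias, features):
    """One round trip: client encrypts, server scores, client decrypts."""
    enc_features = [encrypt(x) for x in features]          # client
    enc_result = encrypted_score(weights, bias, enc_features)  # server
    return 1 if decrypt(enc_result) >= 0 else -1           # client
```

In this sketch the server never sees the features or the score in the clear, and only one ciphertext travels back to the client, which mirrors the non-interactive prediction the abstract reports for the hyperplane classifier. A BGV-based design would additionally pack many values into one ciphertext.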
Keywords: homomorphic encryption; secure multi-party computation; privacy protection; machine learning; HElib
Classification number: TP309.2 [Automation and Computer Technology: Computer System Architecture]