Privacy preserving multi-classification LR scheme for data quality  Cited by: 2

Authors: CAO Laicheng[1]; WU Wentao; FENG Tao[1]; GUO Xian[1] (School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China)

Affiliation: [1] School of Computer and Communication, Lanzhou University of Technology, Lanzhou, Gansu 730050, China

Source: Journal of Xidian University, 2023, No. 5, pp. 188-198 (11 pages)

Funding: National Natural Science Foundation of China (61562059, 61461027); Natural Science Foundation of Gansu Province (20JR5RA467).

Abstract: To protect the privacy of multi-classification logistic regression models in machine learning, ensure the quality of training data, and reduce computation and communication costs, a privacy-preserving multi-classification logistic regression scheme for data quality is proposed. First, based on homomorphic encryption for arithmetic of approximate numbers, batching and the single-instruction multiple-data (SIMD) mechanism are used to pack multiple messages into one ciphertext, and an encrypted vector is securely shifted into the ciphertext corresponding to the plaintext vector. Second, the binary logistic regression model is extended to multiple classes by training multiple classifiers under the "One vs Rest" decomposition strategy. Finally, the training data set is divided into several fixed-size matrices that still retain the complete data structure of the sample information, and the fixed Hessian method is used to optimize the model parameters so that it applies in any case and keeps the parameters private. During model training, the scheme can mitigate data sparsity and ensure data quality. The security analysis shows that neither the trained model nor the users' data is leaked at any point in the process. Experiments show that the training accuracy of the scheme is considerably higher than that of existing schemes and almost the same as that obtained by training on unencrypted data, and that the scheme has a lower computation cost.
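The "One vs Rest" decomposition and the fixed Hessian update named in the abstract can be illustrated with a minimal plaintext sketch. It is not the paper's implementation: encryption, ciphertext packing, and the fixed-size matrix partitioning are not modeled. The sketch assumes one common form of fixed Hessian, the curvature bound (1/4)XᵀX, which the abstract does not confirm as the authors' choice. All function names, the small ridge term, and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def fixed_hessian(X, ridge=1e-3):
    # Assumed form: B = (1/4) X^T X + ridge * I. It depends only on the
    # data, so it is computed once and reused by every update step.
    d = len(X[0])
    B = [[0.25 * sum(row[i] * row[j] for row in X) for j in range(d)]
         for i in range(d)]
    for i in range(d):
        B[i][i] += ridge  # hypothetical stabilizer, keeps B invertible
    return B

def train_binary(X, y, B, iters=50):
    # Newton-like ascent on the log-likelihood with the fixed matrix B
    # standing in for the true (weight-dependent) Hessian.
    d = len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        g = [0.0] * d  # gradient X^T (y - sigmoid(X w))
        for row, t in zip(X, y):
            err = t - sigmoid(sum(wi * xi for wi, xi in zip(w, row)))
            for j in range(d):
                g[j] += err * row[j]
        step = solve(B, g)
        w = [wi + si for wi, si in zip(w, step)]
    return w

def train_ovr(X, labels, classes, iters=50):
    # One vs Rest: one binary classifier per class, all sharing B.
    B = fixed_hessian(X)
    return {c: train_binary(X, [1.0 if l == c else 0.0 for l in labels],
                            B, iters)
            for c in classes}

def predict(models, row):
    # Pick the class whose classifier gives the highest score.
    return max(models,
               key=lambda c: sum(wi * xi for wi, xi in zip(models[c], row)))

# Toy data (hypothetical): bias term followed by two features, three classes.
X = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0],
     [1.0, 5.0, 0.0], [1.0, 6.0, 0.0], [1.0, 5.0, 1.0],
     [1.0, 0.0, 5.0], [1.0, 1.0, 5.0], [1.0, 0.0, 6.0]]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
models = train_ovr(X, labels, classes=[0, 1, 2])
print([predict(models, row) for row in X])
```

Because the matrix B does not depend on the current weights, it is inverted against a new gradient at each step without being recomputed, which is the property that makes this style of update attractive when every operation is expensive.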

Keywords: homomorphic encryption; cloud computing; logistic regression; privacy protection; data quality

CLC number: TP309.2 [Automation and Computer Technology: Computer System Architecture]

 
