Research on regularization method applied to logistic regression and neural networks    Cited by: 8

Authors: 朱劲夫, 刘明哲 [1], 赵成强 [1], 苏世熙

Affiliation: [1] College of Nuclear Technology and Automation Engineering, Chengdu University of Technology, Chengdu 610059, China

Source: Information Technology (《信息技术》), 2016, No. 7, pp. 1-5 (5 pages)

Funding: National Natural Science Foundation of China (41274109); Sichuan Province Youth Science and Technology Innovation Research Team (2015TD0020)

摘  要:在机器学习算法的应用中,当使用小规模、多特征数的训练样本时,模型容易出现过拟合现象。正则化方法可以在一定程度上抑制模型过拟合,提高模型的泛化能力。以手写数字识别为例,分别研究了正则化方法在逻辑回归和BP神经网络中的应用,并比较了两种算法的实际结果。从MNIST手写体数据库中随机选取了5000个样本,经过PCA(Principal Component Analysis)降维后,从中选取不同规模样本并分别将其随机划分为60%的训练集,20%的交叉验证集和20%的测试集。分别构建两种算法对样本进行训练和测试,通过学习曲线选取合适的正则化参数,并比较了在合适正则化参数与未加入正则化参数下,模型与对测试集的预测精度。实验结果表明BP神经网络对手写数字的识别效果优于逻辑回归;同时当使用样本集较小时,正则化方法可以有效地抑制模型过拟合的发生,提高模型预测精度;随着样本集规模的增大,抑制效果减弱。In the applications of machine learning algorithms,models are over-fitted easily when the train sethas small scalar and large features. The regularization method can avoid models being over-fitted and improve thegeneralization ability of mo^lels in some degrees. Taking the handwritten numeral recognitions problems forexample,this paper applied the regularization methodls to logistic regression and BP neural network,respectively.What’s more,two algorithms’ results were compared. Five thousand samples were selected from MNIST databaserandomly and were reduced dimensions by PCA (Principal Component Analysis). Different sizes of samples wereselected and randomly divided into train set (60% ),cross-validation set (2 0 % ) and test set (20% ),respectively. Two m(xdels were established to train and test samples. Appropriate regularization parameters werechosen by learning curves. The prediction accuracy was obtained and compared in two conditions: appropriateregularization parameter and no regularization. The results reflect that BP neural networks performs better thanlogistic regression. What ’ s more, when the train set with small scalar is used,regularization method canefficiently avoid models being over fitted and the performance is worse if the scalar of train set is enlarged.

Keywords: logistic regression; neural network; regularization; generalization ability

Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
