Source: Acta Automatica Sinica (《自动化学报》), 2016, No. 6, pp. 953-958 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (61271143)
Abstract: A well-designed learning rate schedule can significantly improve the convergence rate of a deep learning model and reduce its training time. The AdaGrad and AdaDec learning strategies provide only a single learning rate form for all parameters of the model. Based on the characteristics of the model parameters, this paper proposes a combined learning strategy, AdaMix: a learning rate that depends only on the current gradient is designed for the connection weights, while a power-exponential learning rate is used for the biases. To verify the effect of different learning rate strategies on model convergence, an Autoencoder, a deep learning model, is trained to reconstruct the MNIST image database, with the reconstruction error on the test set during the back-propagation fine-tuning phase as the evaluation index. Experimental results show that AdaMix achieves a lower reconstruction error and less computation than AdaGrad and AdaDec, and therefore converges faster.
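The abstract describes AdaMix only qualitatively: connection weights get a rate tied to the current gradient alone, while biases follow a power-exponential schedule. As a minimal sketch of one plausible reading, the NumPy snippet below applies such a combined rule in a single parameter update; the function name adamix_update, the decay constants r and p, and the exact rate formulas are illustrative assumptions, not the paper's published equations.

```python
import numpy as np

def adamix_update(w, b, grad_w, grad_b, t,
                  eta_w=0.01, eta_b=0.1, r=500.0, p=0.75, eps=1e-8):
    """One AdaMix-style update step (illustrative sketch, assumed forms).

    Weights: per-parameter rate driven only by the CURRENT gradient,
    so no squared-gradient history is accumulated (unlike AdaGrad).
    Biases:  power-exponential decay of the base rate over step t.
    """
    # Weight rate: scaled by the current gradient magnitude only (assumed form)
    lr_w = eta_w / np.sqrt(grad_w ** 2 + eps)
    w_new = w - lr_w * grad_w

    # Bias rate: power-exponential schedule over the step counter (assumed form)
    lr_b = eta_b * (1.0 + t / r) ** (-p)
    b_new = b - lr_b * grad_b
    return w_new, b_new

# Tiny usage example with random gradients
rng = np.random.default_rng(0)
w = rng.normal(size=(784, 256))   # e.g. first Autoencoder layer for MNIST
b = np.zeros(256)
w, b = adamix_update(w, b, rng.normal(size=w.shape),
                     rng.normal(size=b.shape), t=1)
```

Because the weight rate keeps no accumulated squared-gradient state, a step of this form needs less bookkeeping than AdaGrad, which is consistent with the abstract's claim of lower computation.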