Structure Optimized Convolutional Neural Network Based on Unsupervised Pre-training  Cited by: 5


Authors: Liu Qing (刘庆)[1], Tang Xianlun (唐贤伦)[1], Zhang Na (张娜)[1]

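Affiliation: [1] Key Laboratory of Industrial Internet of Things and Networked Control, Ministry of Education, Chongqing University of Posts and Telecommunications, Chongqing 400065, China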

Source: Journal of Sichuan University (Engineering Science Edition), 2017, Issue S2, pp. 210-215 (6 pages)

Funding: National Natural Science Foundation of China (61673079); Chongqing Basic Science and Frontier Technology Research Project (cstc2016jcyj A1919)

Abstract: To address three problems, namely insufficient labeled training samples, convolution kernels of typical convolutional neural networks (CNNs) being set empirically, and fixed network structures that make subsequent re-learning difficult, a new CNN model is proposed based on the sparse autoencoder (SAE) and the CNN. The model trains an SAE on a subset of the original samples to obtain a low-dimensional feature representation, which is then used as the initial value of the CNN's convolution kernels. This both mitigates the shortage of labeled training data and extracts effective features that speed up network convergence. In addition, a network branch is added to the typical CNN structure: first, all training samples are used to train the typical CNN structure; next, most of the training samples are used to train the branch; finally, the remaining small portion of the samples is used for subsequent re-learning, during which only the branch weights are updated, in order to strengthen the features of samples that are easily misjudged because their features are not distinctive. The whole network thus retains the features it has already learned while adding new ones. On the MNIST dataset, the model reaches a test recognition rate of 97.65% after updating the network weights for 10 iterations. On the handwritten Chinese character dataset HCL2000, test accuracy exceeds 93% on simple, medium, complex, and similar characters. With 50 training samples and 250 test samples, the recognition rate for similar characters reaches 80.36%, showing better generalization than the typical CNN and traditional handwritten Chinese character recognition methods. Experiments show that the proposed method can be applied effectively to image recognition tasks such as handwritten character recognition.
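The kernel-initialization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the SAE configuration, so the 5x5 patch size, the 6 hidden units, the hyperparameters (`rho`, `beta`, `lam`, `lr`, `epochs`), the helper names, and the random images standing in for unlabeled samples are all assumptions made for illustration. The sketch trains a one-hidden-layer sparse autoencoder on image patches with a KL-divergence sparsity penalty, then reshapes each hidden unit's input weights into a convolution kernel.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sample_patches(images, patch, count, rng):
    """Cut `count` random patch x patch windows and flatten each into a row."""
    n, height, width = images.shape
    rows = []
    for _ in range(count):
        i = rng.integers(n)
        r = rng.integers(height - patch + 1)
        c = rng.integers(width - patch + 1)
        rows.append(images[i, r:r + patch, c:c + patch].ravel())
    return np.array(rows)


def train_sparse_autoencoder(x, n_hidden, rho=0.05, beta=3.0, lam=1e-4,
                             lr=0.1, epochs=200, seed=0):
    """Full-batch gradient descent on reconstruction + weight decay + KL sparsity.

    All hyperparameter defaults are illustrative assumptions.
    Returns the encoder weights (inputs x hidden) and the loss history.
    """
    rng = np.random.default_rng(seed)
    m, d = x.shape
    w1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        h = sigmoid(x @ w1 + b1)                 # hidden activations
        y = sigmoid(h @ w2 + b2)                 # reconstruction
        rho_hat = np.clip(h.mean(axis=0), 1e-6, 1 - 1e-6)
        kl = np.sum(rho * np.log(rho / rho_hat)
                    + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
        loss = (0.5 * np.sum((y - x) ** 2) / m
                + 0.5 * lam * (np.sum(w1 ** 2) + np.sum(w2 ** 2))
                + beta * kl)
        losses.append(loss)
        # Backpropagation; the sparsity term adds a per-unit offset to the
        # hidden-layer error signal.
        d_out = (y - x) * y * (1 - y)
        sparse = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
        d_hid = (d_out @ w2.T + sparse) * h * (1 - h)
        w2 -= lr * (h.T @ d_out / m + lam * w2)
        b2 -= lr * d_out.mean(axis=0)
        w1 -= lr * (x.T @ d_hid / m + lam * w1)
        b1 -= lr * d_hid.mean(axis=0)
    return w1, losses


rng = np.random.default_rng(0)
images = rng.random((20, 28, 28))       # stand-in for unlabeled 28x28 samples
patches = sample_patches(images, patch=5, count=500, rng=rng)
w1, losses = train_sparse_autoencoder(patches, n_hidden=6)
kernels = w1.T.reshape(6, 5, 5)         # one 5x5 kernel per hidden unit
print(kernels.shape, losses[0], losses[-1])
```

In a full pipeline, `kernels` would be copied into the first convolutional layer before supervised training begins; the branch training and branch-only re-learning stages are not covered by this sketch.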

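Keywords: convolutional neural network; sparse autoencoder; unsupervised pre-training; subsequent re-learning; handwritten character recognition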

Classification No.: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
