Affiliation: [1] School of Information, Renmin University of China
Source: Chinese Journal of Computers (《计算机学报》), 2019, No. 7, pp. 1640-1670 (31 pages)
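Funding: National Natural Science Foundation of China (71531012, 61762073, 71601013); National Social Science Fund of China (18ZDA309); Beijing Natural Science Foundation (4172032, 4174087); Open Project of the State Key Laboratory of Digital Publishing Technology, Peking University Founder Group Co., Ltd.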
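Abstract: Extreme Learning Machine (ELM), a new training framework for feedforward neural networks, has been widely applied in action recognition, emotion recognition, fault diagnosis, and other areas, and has attracted close attention and in-depth study across many fields. ELM was originally proposed to improve the learning speed of single-hidden-layer feedforward neural networks, and was later extended by many researchers to multi-hidden-layer feedforward networks. The core idea of the algorithm is to select the network's input weights and hidden-layer biases at random and keep them fixed during training, so that only the number of hidden-layer neurons needs to be tuned. The output weights are obtained as the minimum-norm least-squares solution of the squared-loss minimization problem, computed through the Moore-Penrose generalized inverse. Compared with traditional gradient-based learning algorithms for feedforward neural networks, ELM offers notable advantages: it is simple to implement, learns extremely fast, and requires little human intervention. It has become one of the most active research directions in artificial intelligence. ELM learning theory shows that when the learning parameters of the hidden-layer neurons are generated randomly and independently of the training samples, a feedforward network can approximate any continuous target function, or any complex decision boundary in a classification task, provided its activation function is nonlinear and piecewise continuous. In recent years random neurons have also been used in a growing number of deep learning methods, and ELM can provide a theoretical basis for their use. This paper first outlines the development of ELM and then describes its working principle in detail. It next summarizes recent progress in ELM theory and applications, focusing on the main learning algorithms and models proposed since ELM was introduced, including their motivation, core ideas, solution methods, strengths and weaknesses, and related issues. Finally, in light of the current state of research, it identifies the controversies, open problems, and challenges surrounding ELM and discusses future research directions and trends.
Extreme Learning Machine (ELM), a new learning framework for single-hidden-layer feedforward neural networks (SLFNs), has received extensive attention and in-depth study in various domains. It has been widely used in applications such as action recognition, emotion recognition, and fault diagnosis. ELM was originally proposed for “generalized” single-hidden-layer feedforward neural networks to overcome the challenges faced by the back-propagation (BP) learning algorithm and its variants. Recent studies show that ELM can be extended to “generalized” multilayer feedforward neural networks, in which a hidden node may be a subnetwork of nodes or a combination of other hidden nodes. ELM provides an efficient and unified learning framework for regression, classification, feature learning, and clustering. ELM learning theory shows that when the learning parameters of the hidden-layer nodes are generated independently of the training samples, a feedforward neural network can approximate any continuous target function, or any complex decision boundary in a classification task, as long as its activation function is nonlinear and piecewise continuous. In ELM, the input weights and hidden biases connecting the input layer and the hidden layer can be generated randomly from any continuous probability distribution, independently of the training samples.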
The output weight matrix between the hidden layer and the output layer is obtained by minimizing the squared loss function; the resulting minimum-norm least-squares solution is computed through the Moore-Penrose generalized inverse. The only parameter that needs to be tuned is the number of hidden-layer nodes. Theoretical studies have shown that ELM retains the universal approximation and classification capability of SLFNs even when it works with randomly generated hidden nodes. Different from traditional gradient-based neural network learning algorithms, which are sensitive to the combination of parameters and easy to trap in loc
Keywords: extreme learning machine; network structure; regularization; kernel learning; deep learning; online learning; parallel computing
Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
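
The training procedure summarized in the abstract (random, fixed input weights and hidden biases; output weights from the Moore-Penrose generalized inverse) can be illustrated with a minimal NumPy sketch. This is not code from the paper: the function names `elm_train` and `elm_predict`, the tanh activation, the uniform sampling range [-1, 1], the choice of 40 hidden nodes, and the toy sine-regression data are all assumptions made for illustration.

```python
import numpy as np


def elm_train(X, T, n_hidden, rng):
    """Fit a basic single-hidden-layer ELM (illustrative sketch only)."""
    n_features = X.shape[1]
    # Input weights and hidden biases are drawn at random, independently of
    # the training data, and are never updated afterwards.
    # The range [-1, 1] is an assumption, not a value from the paper.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Hidden-layer output matrix H; tanh is one nonlinear, continuous choice.
    H = np.tanh(X @ W + b)
    # Output weights: minimum-norm least-squares solution of H @ beta = T,
    # computed with the Moore-Penrose pseudoinverse of H.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta


# Toy regression problem (hypothetical data): fit sin(x) on [-3, 3].
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
T = np.sin(X)

W, b, beta = elm_train(X, T, n_hidden=40, rng=rng)
mse = float(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
print(f"training MSE: {mse:.3e}")
```

In this sketch the only quantity a user would tune is `n_hidden`, which matches the abstract's remark that the number of hidden-layer nodes is the only parameter to be optimized. No iterative gradient step appears anywhere, since the output weights come from a single linear solve.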