Authors: Yibo YANG, Zhengyang SHEN, Huan LI, Zhouchen LIN
Affiliations: [1] JD Explore Academy, Beijing 100176, China; [2] School of Mathematical Sciences, Peking University, Beijing 100871, China; [3] Institute of Robotics and Automatic Information Systems, College of Artificial Intelligence, Nankai University, Tianjin 300071, China; [4] Key Lab of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, Beijing 100871, China; [5] Institute for Artificial Intelligence, Peking University, Beijing 100871, China; [6] Pazhou Lab, Guangzhou 510335, China
Source: Science China (Information Sciences), 2023, Issue 11, pp. 96-108 (13 pages)
Funding: supported by the National Key R&D Program of China (Grant No. 2022ZD0160302) and the National Natural Science Foundation of China (Grant No. 62276004).
Abstract: Neural architecture has been a research focus in recent years due to its importance in determining the performance of deep networks. Representative examples include the residual network (ResNet) with skip connections and the dense network (DenseNet) with dense connections. However, theoretical guidance for manual architecture design and neural architecture search (NAS) is still lacking. In this paper, we propose a manual architecture design framework inspired by optimization algorithms. It is based on the conjecture that an optimization algorithm with a good convergence rate may imply a neural architecture with good performance. Concretely, we prove under certain conditions that forward propagation in a deep neural network is equivalent to the iterative optimization procedure of the gradient descent algorithm minimizing a cost function. Inspired by this correspondence, we derive neural architectures from fast optimization algorithms, including the heavy ball algorithm and Nesterov's accelerated gradient descent algorithm. Surprisingly, we find that ResNet and DenseNet can be deemed special cases of the optimization-inspired architectures. These architectures offer not only theoretical guidance but also good performance in image recognition on multiple datasets, including CIFAR-10, CIFAR-100, and ImageNet. Moreover, we show that our method is also useful for NAS by offering a good initial search point or guiding the search space.
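As an illustration of the correspondence the abstract describes (a sketch under stated assumptions, not the paper's own construction), the heavy-ball update x_{k+1} = x_k - alpha * grad f(x_k) + beta * (x_k - x_{k-1}) has the same form as a residual block with an extra momentum connection, x_{k+1} = x_k + beta * (x_k - x_{k-1}) + g(x_k), where a learned layer g would replace the negative gradient step. A minimal NumPy sketch on a quadratic cost, with all function names and step sizes illustrative:

```python
import numpy as np

def heavy_ball(A, b, alpha=0.1, beta=0.5, steps=200):
    """Heavy-ball iteration minimizing f(x) = 0.5 x^T A x - b^T x.

    Each iterate update
        x_next = x + beta * (x - x_prev) - alpha * grad
    reads like a residual block (identity path x) plus a momentum
    connection (x - x_prev), with -alpha * grad in the role of the
    learned transformation g(x).
    """
    x_prev = np.zeros_like(b)
    x = np.zeros_like(b)
    for _ in range(steps):
        grad = A @ x - b                       # gradient of the quadratic cost
        x_next = x + beta * (x - x_prev) - alpha * grad
        x_prev, x = x, x_next
    return x

A = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([3.0, 2.0])
x_star = heavy_ball(A, b)                      # approaches the minimizer A^{-1} b
```

With beta = 0 the update reduces to plain gradient descent, i.e., the basic residual form x_{k+1} = x_k + g(x_k); the momentum term is what suggests the extra cross-layer connection.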
Keywords: deep neural network; manual architecture design; neural architecture search; image recognition; optimization algorithms; learning-based optimization
Classification (CLC): TP183 (Automation and Computer Technology — Control Theory and Control Engineering); TP391.41 (Automation and Computer Technology — Control Science and Engineering)