Optimization-inspired manual architecture design and neural architecture search  

Authors: Yibo YANG, Zhengyang SHEN, Huan LI, Zhouchen LIN

Affiliations: [1] JD Explore Academy, Beijing 100176, China; [2] School of Mathematical Sciences, Peking University, Beijing 100871, China; [3] Institute of Robotics and Automatic Information Systems, College of Artificial Intelligence, Nankai University, Tianjin 300071, China; [4] Key Lab of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, Beijing 100871, China; [5] Institute for Artificial Intelligence, Peking University, Beijing 100871, China; [6] Pazhou Lab, Guangzhou 510335, China

Source: Science China (Information Sciences), 2023, Issue 11, pp. 96-108 (13 pages)

Funding: Supported by the National Key R&D Program of China (Grant No. 2022ZD0160302) and the National Natural Science Foundation of China (Grant No. 62276004).

Abstract: Neural architecture has been a research focus in recent years due to its importance in determining the performance of deep networks. Representative examples include the residual network (ResNet) with skip connections and the dense network (DenseNet) with dense connections. However, theoretical guidance for manual architecture design and neural architecture search (NAS) is still lacking. In this paper, we propose a manual architecture design framework inspired by optimization algorithms. It is based on the conjecture that an optimization algorithm with a good convergence rate may imply a neural architecture with good performance. Concretely, we prove, under certain conditions, that forward propagation in a deep neural network is equivalent to the iterative optimization procedure of the gradient descent algorithm minimizing a cost function. Inspired by this correspondence, we derive neural architectures from fast optimization algorithms, including the heavy ball algorithm and Nesterov's accelerated gradient descent algorithm. Surprisingly, we find that ResNet and DenseNet can be viewed as special cases of the optimization-inspired architectures. These architectures offer not only theoretical guidance but also good performance in image recognition on multiple datasets, including CIFAR-10, CIFAR-100, and ImageNet. Moreover, we show that our method is also useful for NAS by offering a good initial search point or guiding the search space.
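The correspondence the abstract describes can be illustrated with a small numerical sketch (a hypothetical demo, not the paper's implementation): a gradient-descent step x_{k+1} = x_k - α∇f(x_k) has the same additive structure as a residual block x_{k+1} = x_k + F(x_k), while the heavy ball step x_{k+1} = x_k - α∇f(x_k) + β(x_k - x_{k-1}) combines the two previous iterates, suggesting an architecture with connections to two earlier layers. The quadratic objective and the step sizes below are arbitrary choices for illustration.

```python
import numpy as np

# Toy objective f(x) = 0.5 * x^T A x with an ill-conditioned A,
# so the speedup of the momentum (heavy ball) update is visible.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x

alpha, beta = 0.09, 0.5  # hand-picked step size and momentum coefficient
x0 = np.array([1.0, 1.0])
x_gd = x0.copy()
x_hb, x_prev = x0.copy(), x0.copy()

for _ in range(50):
    # Gradient descent: x_{k+1} = x_k - alpha * grad(x_k)
    # -- structurally a residual update x_{k+1} = x_k + F(x_k).
    x_gd = x_gd - alpha * grad(x_gd)

    # Heavy ball: x_{k+1} = x_k - alpha * grad(x_k) + beta * (x_k - x_{k-1})
    # -- depends on TWO previous iterates, i.e. skip connections
    #    reaching back two "layers".
    x_hb, x_prev = x_hb - alpha * grad(x_hb) + beta * (x_hb - x_prev), x_hb

print("GD distance to optimum:", np.linalg.norm(x_gd))
print("Heavy ball distance to optimum:", np.linalg.norm(x_hb))
```

After 50 iterations the heavy ball iterate is far closer to the minimizer at the origin than plain gradient descent, mirroring the paper's conjecture that a faster optimization algorithm suggests a better-performing architecture.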

Keywords: deep neural network; manual architecture design; neural architecture search; image recognition; optimization algorithms; learning-based optimization

Classification: TP183 [Automation and Computer Technology — Control Theory and Control Engineering]; TP391.41 [Automation and Computer Technology — Control Science and Engineering]

 
