NAT: Neural Architecture Transformer for Accurate and Compact Architectures

by Yong Guo, et al.

Designing effective architectures is one of the key factors behind the success of deep neural networks. Existing deep architectures are either manually designed or automatically searched by Neural Architecture Search (NAS) methods. However, even a well-searched architecture may still contain many non-significant or redundant modules or operations (e.g., convolution or pooling), which not only incur substantial memory consumption and computation cost but may also degrade performance. Thus, it is necessary to optimize the operations inside an architecture to improve performance without introducing extra computation cost. Unfortunately, this constrained optimization problem is NP-hard. To make the problem tractable, we cast it as a Markov decision process (MDP) and learn a Neural Architecture Transformer (NAT) that replaces redundant operations with more computationally efficient ones (e.g., a skip connection, or removing the connection entirely). Based on the MDP, we train NAT with reinforcement learning to obtain optimization policies for different architectures. To verify the effectiveness of the proposed strategies, we apply NAT to both hand-crafted and NAS-based architectures. Extensive experiments on two benchmark datasets, CIFAR-10 and ImageNet, demonstrate that architectures transformed by NAT significantly outperform both their original forms and architectures optimized by existing methods.
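
As a rough illustration of the constraint described above (replace an operation only with one that is at least as cheap), the following toy sketch treats an architecture as a set of edges, each labeled with an operation. It is not the authors' implementation: the operation names, the cost values in `COST`, the edge encoding, and the uniform `random_policy` standing in for a learned policy are all assumptions made for the example, and the reward signal used for reinforcement learning is omitted because the abstract does not specify it.

```python
import random

# Hypothetical operation categories with assumed relative costs
# (illustrative values only, not taken from the paper).
COST = {"conv": 2, "skip": 1, "none": 0}


def allowed_actions(op):
    """Return the operations whose assumed cost does not exceed that of `op`."""
    return [o for o in COST if COST[o] <= COST[op]]


def random_policy(edge, op, actions, rng):
    """Stand-in for a learned policy: sample uniformly among allowed actions."""
    return rng.choice(actions)


def transform(arch, policy, rng):
    """One transformation step: pick a replacement operation for every edge."""
    return {
        edge: policy(edge, op, allowed_actions(op), rng)
        for edge, op in arch.items()
    }


def total_cost(arch):
    """Sum of assumed per-operation costs over all edges."""
    return sum(COST[op] for op in arch.values())


# A toy cell: keys are (source node, target node), values are operations.
arch = {(0, 2): "conv", (1, 2): "skip", (0, 3): "conv", (2, 3): "none"}
new_arch = transform(arch, random_policy, random.Random(0))
```

Because every replacement is drawn from `allowed_actions`, the transformed architecture can never cost more than the original under the assumed cost table, whichever policy is plugged in. A learned policy would replace `random_policy` and be trained to prefer replacements that also improve accuracy.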




