WeightNet: Revisiting the Design Space of Weight Networks

07/23/2020
by   Ningning Ma, et al.

We present a conceptually simple, flexible, and effective framework for weight-generating networks. Our approach is general: it unifies two currently distinct and highly effective methods, SENet and CondConv, in a single framework on the weight space. The method, called WeightNet, generalizes the two by simply adding one more grouped fully-connected layer to the attention activation layer. We use WeightNet, composed entirely of (grouped) fully-connected layers, to directly output the convolutional weight. WeightNet is easy and memory-efficient to train because it operates on the kernel space instead of the feature space. Because of this flexibility, our method outperforms existing approaches on both ImageNet and COCO detection tasks, achieving better Accuracy-FLOPs and Accuracy-Parameter trade-offs. The framework on the flexible weight space has the potential to further improve performance. Code is available at https://github.com/megvii-model/WeightNet.
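The pipeline described in the abstract (an attention activation layer followed by one grouped fully-connected layer that directly outputs the convolutional weight) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The global average pooling step, the sigmoid activation, all dimensions, and the names `grouped_fc` and `generate_conv_weight` are assumptions chosen for illustration; the reference code is in the repository linked above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grouped_fc(a, w, groups):
    """Grouped fully-connected layer (hypothetical helper).

    a: input vector of shape (d_in,)
    w: weights of shape (groups, d_out // groups, d_in // groups)
    Each group of inputs is connected only to its own group of outputs.
    """
    a_g = a.reshape(groups, -1)               # (groups, d_in / groups)
    out = np.einsum("goi,gi->go", w, a_g)     # (groups, d_out / groups)
    return out.reshape(-1)                    # (d_out,)

def generate_conv_weight(x, w_reduce, w_group, c_out, c_in, k, groups):
    """Map one input sample to a sample-specific convolution kernel.

    x: feature map of shape (c_in, H, W)
    Assumed steps: global average pooling, a dense layer with sigmoid
    (the "attention activation"), then a grouped FC layer whose output
    is reshaped into a kernel of shape (c_out, c_in, k, k).
    """
    pooled = x.mean(axis=(1, 2))              # (c_in,)
    act = sigmoid(w_reduce @ pooled)          # (d,)
    flat = grouped_fc(act, w_group, groups)   # (c_out * c_in * k * k,)
    return flat.reshape(c_out, c_in, k, k)

# Illustrative sizes only; none of these come from the paper.
rng = np.random.default_rng(0)
c_in, c_out, k, d, groups = 4, 8, 3, 8, 8
x = rng.standard_normal((c_in, 16, 16))
w_reduce = rng.standard_normal((d, c_in))
w_group = rng.standard_normal((groups, c_out * c_in * k * k // groups, d // groups))

weight = generate_conv_weight(x, w_reduce, w_group, c_out, c_in, k, groups)
print(weight.shape)
```

In this sketch the generated kernel depends on the input sample, and the only learned parameters are those of the two fully-connected layers, which is the sense in which the computation lives on the kernel space rather than the feature space.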

