Model Fusion of Heterogeneous Neural Networks via Cross-Layer Alignment

10/29/2021
by Dang Nguyen, et al.

Layer-wise model fusion via optimal transport, named OTFusion, applies soft neuron association to unify different pre-trained networks and thereby save computational resources. Despite its success, OTFusion requires the input networks to have the same number of layers. To address this issue, we propose a novel model fusion framework, named CLAFusion, that fuses neural networks with different numbers of layers, which we refer to as heterogeneous neural networks, via cross-layer alignment. The cross-layer alignment problem, which is an unbalanced assignment problem, can be solved efficiently using dynamic programming. Based on the cross-layer alignment, our framework balances the number of layers of the neural networks before applying layer-wise model fusion. Our synthetic experiments indicate that the fused network from CLAFusion achieves more favorable performance than the individual networks trained on heterogeneous data, without any retraining. With an extra fine-tuning step, it improves the accuracy of residual networks on the CIFAR10 dataset. Finally, we explore its application to model compression and knowledge distillation in the teacher-student setting.

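As a rough illustration of the cross-layer alignment step, the sketch below solves an order-preserving, unbalanced assignment between the layers of a shallower and a deeper network with dynamic programming. The cost matrix, function name, and recurrence here are assumptions made for illustration; they are not the paper's actual implementation.

# Hypothetical sketch: align each layer of a shallow network (m layers) to a
# distinct layer of a deeper network (n >= m layers), preserving layer order,
# so that the total dissimilarity is minimized. The layer-dissimilarity cost
# is a placeholder and may differ from the measure used in CLAFusion.
import numpy as np

def cross_layer_alignment(cost: np.ndarray):
    """cost[i, j]: dissimilarity between shallow layer i and deep layer j.

    Returns the minimal total cost and the monotone mapping from each
    shallow layer to a distinct deep layer.
    """
    m, n = cost.shape
    assert m <= n, "the first network is assumed to be the shallower one"

    # dp[i, j]: best cost of aligning the first i shallow layers
    # using only the first j deep layers.
    dp = np.full((m + 1, n + 1), np.inf)
    dp[0, :] = 0.0
    matched = np.zeros((m + 1, n + 1), dtype=bool)  # True if layer i is matched to layer j

    for i in range(1, m + 1):
        for j in range(i, n + 1):
            skip = dp[i, j - 1]                             # leave deep layer j unmatched
            match = dp[i - 1, j - 1] + cost[i - 1, j - 1]   # match shallow i with deep j
            if match <= skip:
                dp[i, j] = match
                matched[i, j] = True
            else:
                dp[i, j] = skip

    # Backtrack to recover the alignment.
    mapping, i, j = [], m, n
    while i > 0:
        if matched[i, j]:
            mapping.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        else:
            j -= 1
    return dp[m, n], mapping[::-1]

# Toy usage: align a 3-layer network to a 5-layer one with a random cost matrix.
rng = np.random.default_rng(0)
total_cost, mapping = cross_layer_alignment(rng.random((3, 5)))
print(total_cost, mapping)  # pairs of (shallow_layer, deep_layer) indices

Once such a mapping is found, layers of the deeper network can be grouped or skipped so that both networks expose the same number of aligned layers before layer-wise fusion is applied.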
