Aligned Weight Regularizers for Pruning Pretrained Neural Networks

04/04/2022
by   James O'Neill, et al.

While various avenues of research have been explored for iterative pruning, little is known about what effect pruning has on zero-shot test performance and its potential implications for the choice of pruning criteria. This pruning setup is particularly important for cross-lingual models, which implicitly learn alignment between language representations during pretraining; if this alignment is distorted by pruning, performance degrades not only on the language data used for retraining but also on the zero-shot languages used for evaluation. In this work, we show that there is a clear performance discrepancy in magnitude-based pruning when comparing standard supervised learning to the zero-shot setting. From this finding, we propose two weight regularizers that aim to maximize the alignment between units of the pruned and unpruned networks, mitigating alignment distortion in pruned cross-lingual models and performing well in both the non-zero-shot and zero-shot settings. We provide experimental results on cross-lingual tasks in the zero-shot setting using XLM-RoBERTa_Base, where we also find that pruning causes varying degrees of representational degradation depending on the language of the zero-shot test set. This is also the first study that focuses on cross-lingual language model compression.
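To make the setup concrete, the sketch below illustrates the two ingredients the abstract refers to: magnitude-based pruning (zeroing the smallest-magnitude weights) and an alignment-style penalty between the activations of the pruned and unpruned networks. This is a minimal illustration in numpy, not the paper's exact regularizers; the function names and the cosine-similarity form of the penalty are assumptions for illustration.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    This is standard magnitude-based pruning: the `sparsity` fraction of
    weights with the smallest absolute value are set to zero.
    Returns the pruned weights and the boolean keep-mask.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def alignment_penalty(h_unpruned, h_pruned, eps=1e-8):
    """Hypothetical alignment regularizer: penalize low cosine similarity
    between the unpruned and pruned networks' unit activations.

    Each row of the inputs is one example's activation vector. The penalty
    is zero when the two representations are perfectly aligned.
    """
    num = np.sum(h_unpruned * h_pruned, axis=-1)
    denom = (np.linalg.norm(h_unpruned, axis=-1)
             * np.linalg.norm(h_pruned, axis=-1) + eps)
    return float(np.mean(1.0 - num / denom))
```

In an actual training loop, the alignment penalty would be added to the task loss when retraining the pruned model, pulling the pruned network's representations back toward those of the unpruned cross-lingual model.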



Related research

09/15/2019
Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model
Because it is not feasible to collect training data for every language, ...

01/26/2021
Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks
In zero-shot cross-lingual transfer, a supervised NLP task trained on a ...

09/25/2021
Language Model Priming for Cross-Lingual Event Extraction
We present a novel, language-agnostic approach to "priming" language mod...

05/06/2021
XeroAlign: Zero-Shot Cross-lingual Transformer Alignment
The introduction of pretrained cross-lingual language models brought dec...

10/10/2020
Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns
This paper describes our submission of the WMT 2020 Shared Task on Sente...

04/19/2019
Zero-Shot Cross-Lingual Opinion Target Extraction
Aspect-based sentiment analysis involves the recognition of so called op...

04/16/2020
Cross-lingual Contextualized Topic Models with Zero-shot Learning
Many data sets in a domain (reviews, forums, news, etc.) exist in parall...
