Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning

by Momin Abbas, et al.

Model-agnostic meta-learning (MAML) is currently one of the dominant approaches for few-shot meta-learning. Despite its effectiveness, the optimization of MAML can be challenging due to its innate bilevel problem structure. Specifically, the loss landscape of MAML is much more complex, with possibly more saddle points and local minimizers, than that of its empirical risk minimization counterpart. To address this challenge, we leverage the recently proposed sharpness-aware minimization and develop a sharpness-aware MAML approach that we term Sharp-MAML. We empirically demonstrate that Sharp-MAML and its computation-efficient variant can outperform popular existing MAML baselines (e.g., +12% accuracy on Mini-Imagenet). We complement the empirical study with a convergence rate analysis and a generalization bound for Sharp-MAML. To the best of our knowledge, this is the first empirical and theoretical study on sharpness-aware minimization in the context of bilevel learning. The code is available at
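
The abstract does not spell out the update rule, so the following is only a rough illustration of generic sharpness-aware minimization, not the Sharp-MAML algorithm itself (which applies the idea to a bilevel meta-learning objective). It shows the usual two-step pattern: perturb the weights by a radius `rho` along the normalized gradient, then descend using the gradient taken at the perturbed point. The toy quadratic loss, the names `sam_step`, `CURVATURE`, and the values of `lr` and `rho` are all assumptions made for this sketch.

```python
import math

# Hypothetical toy loss L(w) = 0.5 * sum(c_i * w_i^2); it stands in for a real training loss.
CURVATURE = [10.0, 1.0]

def loss(w):
    return 0.5 * sum(c * x * x for c, x in zip(CURVATURE, w))

def grad(w):
    # Analytic gradient of the toy quadratic loss.
    return [c * x for c, x in zip(CURVATURE, w)]

def sam_step(w, lr=0.05, rho=0.05):
    """One generic sharpness-aware step (illustrative values for lr and rho)."""
    g = grad(w)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return list(w)  # stationary point: no direction to perturb along
    # Step 1: move to a nearby point of radius rho along the normalized gradient.
    w_adv = [x + rho * gx / norm for x, gx in zip(w, g)]
    # Step 2: descend from the original weights using the gradient at the perturbed point.
    g_adv = grad(w_adv)
    return [x - lr * gx for x, gx in zip(w, g_adv)]

w = [1.0, 1.0]
for _ in range(100):
    w = sam_step(w)
final_loss = loss(w)
```

In a MAML setting this perturb-then-descend step could in principle be applied at the inner (task adaptation) level, the outer (meta) level, or both; which choice the paper makes is not stated in this abstract.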


When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications

Model-Agnostic Meta-Learning (MAML), a model-agnostic meta-learning meth...

Provable Generalization of Overparameterized Meta-learning Trained with SGD

Despite the superior empirical success of deep meta-learning, theoretica...

Torchmeta: A Meta-Learning library for PyTorch

The constant introduction of standardized benchmarks in the literature h...

Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis

We derive a novel information-theoretic analysis of the generalization p...

MetaHDR: Model-Agnostic Meta-Learning for HDR Image Reconstruction

Capturing scenes with a high dynamic range is crucial to reproducing ima...

Global Convergence and Induced Kernels of Gradient-Based Meta-Learning with Neural Nets

Gradient-based meta-learning (GBML) with deep neural nets (DNNs) has bec...

Deblurring Photographs of Characters Using Deep Neural Networks

In this paper, we present our approach for the Helsinki Deblur Challenge...
