Memory-based Optimization Methods for Model-Agnostic Meta-Learning

by Bokun Wang, et al.

Model-agnostic meta-learning (MAML) has recently garnered tremendous attention, yet stochastic optimization of MAML remains immature. Existing algorithms for MAML are based on the "episode" idea: at each iteration, they sample a number of tasks and a number of data points for each sampled task, then use them to update the meta-model. However, these algorithms either do not guarantee convergence with a constant mini-batch size or require processing a large number of tasks at every iteration. The latter is not viable for continual learning or cross-device federated learning, where only a small number of tasks are available per iteration or per round. This paper addresses these issues by (i) proposing efficient memory-based stochastic algorithms for MAML with a diminishing convergence error, which only require sampling a constant number of tasks and a constant number of examples per task per iteration; and (ii) proposing communication-efficient distributed memory-based MAML algorithms for personalized federated learning in both the cross-device (with client sampling) and the cross-silo (without client sampling) settings. The key novelty of the proposed algorithms is to maintain an individual personalized model (i.e., a memory) for each task in addition to the meta-model, and to update these memories only for the sampled tasks, using a momentum method that incorporates historical updates at each iteration. The theoretical results significantly improve the optimization theory for MAML, and the empirical results corroborate the theory.
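
To make the memory idea concrete, the following is a minimal illustrative sketch in Python with NumPy. It is not the paper's algorithm: the quadratic task losses, the function name `memory_maml_sketch`, and all step sizes (`alpha`, `beta`, `gamma`, `eta`) are assumptions chosen so the example is small and self-checking. It only shows the general pattern described in the abstract: one personalized model per task is kept in memory, only the memories of sampled tasks are refreshed by a moving average, and the meta-model is updated with a momentum-averaged gradient.

```python
import numpy as np


def memory_maml_sketch(centers, steps=3000, alpha=0.1, beta=0.5,
                       gamma=0.1, eta=0.02, tasks_per_step=2,
                       noise=0.05, seed=0):
    """Toy memory-based MAML loop (illustrative assumption, not the paper's method).

    Task i has the hypothetical loss f_i(w) = 0.5 * ||w - centers[i]||^2,
    so its gradient is (w - centers[i]) and its Hessian is the identity.
    """
    rng = np.random.default_rng(seed)
    n, d = centers.shape
    w = np.zeros(d)                 # meta-model
    memory = np.tile(w, (n, 1))     # one personalized model per task
    m = np.zeros(d)                 # momentum buffer for the meta-gradient

    for _ in range(steps):
        # Sample a constant, small number of tasks per iteration.
        sampled = rng.choice(n, size=tasks_per_step, replace=False)
        g = np.zeros(d)
        for i in sampled:
            # Noisy inner gradient stands in for a small mini-batch.
            inner_grad = (w - centers[i]) + noise * rng.standard_normal(d)
            # Refresh only this task's memory with a moving average of
            # the one-step adapted model w - alpha * grad f_i(w).
            memory[i] = (1 - beta) * memory[i] + beta * (w - alpha * inner_grad)
            # Meta-gradient (I - alpha * Hessian) * grad f_i(memory[i]);
            # the Hessian is the identity for this toy loss.
            outer_grad = (memory[i] - centers[i]) + noise * rng.standard_normal(d)
            g += (1 - alpha) * outer_grad
        g /= tasks_per_step
        # Momentum step that blends in historical meta-gradients.
        m = (1 - gamma) * m + gamma * g
        w = w - eta * m
    return w, memory


centers = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
w, memory = memory_maml_sketch(centers)
print("meta-model:", w)
```

For these toy losses the MAML objective is minimized at the mean of the task centers, so the returned meta-model should land near `[1, 1]` even though only two of the four tasks are touched per iteration. Memories of unsampled tasks are left stale until those tasks are drawn again, which is the trade-off the memory mechanism accepts in exchange for a constant per-iteration cost.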




Achieving Linear Speedup in Non-IID Federated Bilevel Learning

Federated bilevel optimization has received increasing attention in vari...

Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes

In this work, we consider the problem of minimizing the sum of Moreau en...

PersA-FL: Personalized Asynchronous Federated Learning

We study the personalized federated learning problem under asynchronous ...

Motley: Benchmarking Heterogeneity and Personalization in Federated Learning

Personalized federated learning considers learning models unique to each...

Constant Memory Attentive Neural Processes

Neural Processes (NPs) are efficient methods for estimating predictive u...

Convergence and Accuracy Trade-Offs in Federated Learning and Meta-Learning

We study a family of algorithms, which we refer to as local update metho...

SimFBO: Towards Simple, Flexible and Communication-efficient Federated Bilevel Learning

Federated bilevel optimization (FBO) has shown great potential recently ...
