Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes

by Konstantin Mishchenko, et al.

In this work, we consider the problem of minimizing the sum of Moreau envelopes of given functions, which has previously appeared in the context of meta-learning and personalized federated learning. In contrast to the existing theory, which requires running subsolvers until a certain precision is reached, we only assume that a finite number of gradient steps is taken at each iteration. As a special case, our theory allows us to show the convergence of First-Order Model-Agnostic Meta-Learning (FO-MAML) to a neighborhood of a solution of the Moreau objective. We also study a more general family of first-order algorithms that can be viewed as a generalization of FO-MAML. Our main theoretical contribution is an improvement upon the inexact SGD framework. In particular, our perturbed-iterate analysis allows for tighter guarantees that improve the dependency on the problem's conditioning. In contrast to the related work on meta-learning, ours does not require any assumption on Hessian smoothness, and can leverage the smoothness and convexity of the reformulation based on Moreau envelopes. Furthermore, to fill the gaps in the comparison of FO-MAML to Implicit MAML (iMAML), we show that the objective of iMAML is neither smooth nor convex, implying that it has no convergence guarantees based on the existing theory.
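To make the "neighborhood of a solution" statement concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration and is not taken from the paper: the tasks are one-dimensional quadratics f_i(z) = 0.5 * a_i * (z - b_i)^2, the Moreau envelope uses the standard definition F_i(x) = min_z { f_i(z) + (z - x)^2 / (2 * alpha) }, FO-MAML is run full-batch with a single inner gradient step, and the names `fo_maml`, `moreau_minimizer`, `alpha`, and `beta` are hypothetical.

```python
# Toy illustration (assumed setup): FO-MAML on 1-D quadratic tasks
# f_i(z) = 0.5 * a_i * (z - b_i)**2, compared with the exact minimizer
# of the averaged Moreau envelopes.

def grad_f(a, b, z):
    """Gradient of the quadratic task loss f(z) = 0.5 * a * (z - b)**2."""
    return a * (z - b)

def moreau_minimizer(tasks, alpha):
    """Closed-form minimizer of the average Moreau envelope.

    For a quadratic task the proximal point is (x + alpha*a*b) / (1 + alpha*a),
    so the envelope gradient is a * (x - b) / (1 + alpha*a). Setting the
    average gradient to zero gives a weighted mean of the b_i.
    """
    weights = [a / (1.0 + alpha * a) for a, _ in tasks]
    return sum(w * b for w, (_, b) in zip(weights, tasks)) / sum(weights)

def fo_maml(tasks, x0, alpha, beta, iters):
    """Deterministic (full-batch) FO-MAML with one inner gradient step."""
    x = x0
    for _ in range(iters):
        g = 0.0
        for a, b in tasks:
            z = x - alpha * grad_f(a, b, x)  # inner adaptation step
            g += grad_f(a, b, z)             # first-order meta-gradient
        x -= beta * g / len(tasks)           # outer step on the average
    return x

tasks = [(1.0, -1.0), (4.0, 2.0)]  # (a_i, b_i) pairs, chosen arbitrarily
alpha, beta = 0.1, 0.2

x_star = moreau_minimizer(tasks, alpha)
x_fo = fo_maml(tasks, x0=0.0, alpha=alpha, beta=beta, iters=500)
gap = abs(x_fo - x_star)
print(f"Moreau minimizer: {x_star:.4f}, FO-MAML limit: {x_fo:.4f}, gap: {gap:.4f}")
```

In this toy setting the explicit inner step only approximates the proximal point, so the FO-MAML iterates settle at a point that is close to, but not equal to, the minimizer of the Moreau objective. The paper's analysis covers the stochastic and more general case, which this sketch does not attempt to reproduce.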


