BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach

09/19/2022
by Mao Ye, et al.

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems, including hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO methods need to differentiate through the lower-level optimization process with implicit differentiation, which requires expensive calculations involving the Hessian matrix. There has been a recent quest for first-order methods for BO, but the methods proposed to date tend to be complicated and impractical for large-scale deep learning applications. In this work, we propose a simple first-order BO algorithm that depends only on first-order gradient information, requires no implicit differentiation, and is practical and efficient for large-scale non-convex functions in deep learning. We provide a non-asymptotic analysis of the proposed method's convergence to stationary points for non-convex objectives and present empirical results that show its superior practical performance.
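
To make "first-order only" concrete, here is a minimal sketch on a toy problem. The abstract does not spell out the algorithm, so everything below is an assumption for illustration rather than the paper's exact method: the toy objectives f and g, the function name first_order_bilevel, and the hyperparameters xi, eta, alpha, and inner_steps are all hypothetical. The sketch rewrites the bilevel problem as minimizing f(v, theta) subject to q(v, theta) = g(v, theta) - g(v, theta*(v)) <= 0, approximates theta*(v) with a few plain gradient steps, and combines the gradients of f and q with a nonnegative multiplier chosen so that q decreases. No Hessian or implicit differentiation appears anywhere.

```python
import numpy as np

# Hypothetical toy problem (not from the paper):
#   upper level: f(v, theta) = 0.5*(theta - 1)^2 + 0.5*v^2
#   lower level: g(v, theta) = 0.5*(theta - v)^2, so theta*(v) = v
# The true bilevel solution is v = theta = 0.5.

def f_grad(v, theta):
    """Gradient of f with respect to (v, theta)."""
    return np.array([v, theta - 1.0])

def g_grad(v, theta):
    """Gradient of g with respect to (v, theta)."""
    return np.array([v - theta, theta - v])

def first_order_bilevel(v=2.0, theta=0.0, steps=200, xi=0.05, eta=0.5,
                        inner_steps=10, alpha=0.5):
    for _ in range(steps):
        # Approximate the lower-level minimizer with a few gradient steps.
        theta_T = theta
        for _ in range(inner_steps):
            theta_T -= alpha * g_grad(v, theta_T)[1]

        # Gradient of q_hat(v, theta) = g(v, theta) - g(v, theta_T),
        # treating theta_T as a constant (no differentiation through the loop).
        grad_q = g_grad(v, theta) - np.array([g_grad(v, theta_T)[0], 0.0])
        grad_f = f_grad(v, theta)

        # Multiplier: smallest lam >= 0 such that the step direction d satisfies
        # <grad_q, d> >= eta * ||grad_q||^2, i.e. q_hat decreases to first order.
        sq = float(grad_q @ grad_q)
        lam = max((eta * sq - float(grad_f @ grad_q)) / sq, 0.0) if sq > 0.0 else 0.0

        direction = grad_f + lam * grad_q
        v -= xi * direction[0]
        theta -= xi * direction[1]
    return float(v), float(theta)

if __name__ == "__main__":
    print(first_order_bilevel())
```

With these assumed settings the iterates should end close to the analytic solution v = theta = 0.5. Each outer step costs only a handful of gradient evaluations of g plus one of f, which is the practical appeal of first-order approaches over Hessian-based implicit differentiation.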

research
08/25/2017

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

The resurgence of deep learning, as a highly effective machine learning ...
research
02/20/2023

Nystrom Method for Accurate and Scalable Implicit Differentiation

The essential difficulty of gradient-based bilevel optimization using im...
research
11/06/2018

Quasi-Newton Optimization in Deep Q-Learning for Playing ATARI Games

Reinforcement Learning (RL) algorithms allow artificial agents to improv...
research
01/26/2023

A Fully First-Order Method for Stochastic Bilevel Optimization

We consider stochastic unconstrained bilevel optimization problems when ...
research
09/27/2022

The Curse of Unrolling: Rate of Differentiating Through Optimization

Computing the Jacobian of the solution of an optimization problem is a c...
research
05/31/2021

Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective

Accelerated gradient-based methods are being extensively used for solvin...
research
11/08/2019

Penalty Method for Inversion-Free Deep Bilevel Optimization

Bilevel optimizations are at the center of several important machine lea...