Sketched Newton-Raphson

by Rui Yuan, et al.

We propose a new globally convergent stochastic second-order method. Our starting point is the development of a new Sketched Newton-Raphson (SNR) method for solving large-scale nonlinear equations of the form F(x) = 0 with F: ℝ^d → ℝ^d. We then show how to design several stochastic second-order optimization methods by rewriting the optimization problem of interest as a system of nonlinear equations and applying SNR. For instance, by applying SNR to find a stationary point of a generalized linear model (GLM), we derive completely new and scalable stochastic second-order methods. We show that the resulting method is highly competitive with state-of-the-art variance-reduced methods. Using a variable splitting trick, we also show that the Stochastic Newton method (SNM) is a special case of SNR, and we use this connection to establish the first global convergence theory of SNM. Indeed, by showing that SNR can be interpreted as a variant of the stochastic gradient descent (SGD) method, we are able to leverage proof techniques of SGD and establish a global convergence theory and rates of convergence for SNR. As a special case, our theory also provides a new global convergence theory for the original Newton-Raphson method under strictly weaker assumptions than those commonly used for global convergence. There are many ways to rewrite an optimization problem as nonlinear equations, and each rewrite leads to a distinct method when combined with SNR. As such, we believe that SNR and its global convergence theory will open the way to designing and analysing a host of new stochastic second-order methods.
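To make the core idea concrete, here is a minimal sketch of a sketched Newton-Raphson iteration: at each step, draw a random sketch matrix S and take the least-norm step solving the sketched Newton system. The Gaussian sketch distribution, fixed step size, and test problem are illustrative assumptions, not the paper's exact algorithmic or experimental setup.

```python
import numpy as np

def sketched_newton_raphson(F, DF, x0, sketch_dim=2, step=1.0, iters=500, seed=0):
    """Illustrative SNR-style iteration for F(x) = 0 with Jacobian DF.

    Update (one plausible reading of the sketched Newton step):
        x+ = x - step * DF(x)^T S (S^T DF(x) DF(x)^T S)^+ S^T F(x)
    where S is a fresh random d x sketch_dim sketch at every iteration.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    for _ in range(iters):
        J = DF(x)                                  # d x d Jacobian at x
        S = rng.standard_normal((d, sketch_dim))   # Gaussian sketch (an assumption)
        A = S.T @ J                                # sketched Jacobian, sketch_dim x d
        g = S.T @ F(x)                             # sketched residual
        # least-norm solution of the sketched Newton system A * delta = g
        x = x - step * A.T @ np.linalg.pinv(A @ A.T) @ g
    return x

# Example: root of a small nonlinear system (illustrative problem, not from the paper)
F  = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])
DF = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])
root = sketched_newton_raphson(F, DF, x0=[2.0, 3.0], sketch_dim=2, iters=100)
```

With sketch_dim equal to d the sketch cancels and each step reduces to a full Newton step; with sketch_dim < d each iteration only touches a low-dimensional projection of the system, which is where the scalability of sketching comes from.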




