Robust Training in High Dimensions via Block Coordinate Geometric Median Descent

by Anish Acharya et al.

Geometric median (Gm) is a classical statistical method for robustly estimating the underlying uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5. However, its computational complexity makes it prohibitively expensive for robustifying stochastic gradient descent (SGD) in high-dimensional optimization problems. In this paper, we show that by applying Gm to only a judiciously chosen block of coordinates at a time and using a memory mechanism, one can retain the breakdown point of 0.5 for smooth non-convex problems, with non-asymptotic convergence rates comparable to those of SGD with full Gm.
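To make the two ingredients concrete, the following is a minimal sketch, not the authors' implementation: the Gm is approximated with Weiszfeld's algorithm on a coordinate block of the worker gradients, while off-block coordinates reuse a memory vector. For simplicity the block here is sampled uniformly at random, whereas the paper chooses blocks judiciously; all function names are illustrative.

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    """Approximate the geometric median of rows of `points`
    via Weiszfeld's iteratively reweighted averaging."""
    y = points.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(points - y, axis=1)
        d = np.maximum(d, eps)  # guard against division by zero
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < eps:
            return y_new
        y = y_new
    return y

def block_gm_aggregate(grads, memory, block_size, rng):
    """Sketch of block-coordinate Gm aggregation: run Gm only on a
    sampled coordinate block; keep remembered values elsewhere."""
    dim = grads.shape[1]
    block = rng.choice(dim, size=block_size, replace=False)
    agg = memory.copy()                     # stale estimate off-block
    agg[block] = geometric_median(grads[:, block])
    return agg, block

# Toy usage: 9 honest workers send all-ones gradients, 1 corrupted
# worker sends a gross outlier; the block Gm stays near the honest value.
rng = np.random.default_rng(0)
grads = np.vstack([np.ones((9, 10)), 100.0 * np.ones((1, 10))])
memory = np.zeros(10)
agg, block = block_gm_aggregate(grads, memory, block_size=4, rng=rng)
```

Because only `block_size` coordinates enter the Weiszfeld iterations, each aggregation step costs a fraction of the full d-dimensional Gm, which is the source of the speedup the abstract describes.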


Related research:

- On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
- Shuffle SGD is Always Better than SGD: Improved Analysis of SGD with Arbitrary Data Orders
- Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT
- Global Convergence and Stability of Stochastic Gradient Descent
- SGD with Coordinate Sampling: Theory and Practice
- Online stochastic Newton methods for estimating the geometric median and applications
