Semi-Cyclic Stochastic Gradient Descent

04/23/2019
by Hubert Eichner, et al.

We consider convex SGD updates with a block-cyclic structure, i.e. where each cycle consists of a small number of blocks, each with many samples from a possibly different, block-specific distribution. This situation arises, e.g., in Federated Learning, where the mobile devices available for updates at different times of day have different characteristics. We show that such block-cyclic structure can significantly deteriorate the performance of SGD, but propose a simple approach that allows prediction with the same performance guarantees as for i.i.d., non-cyclic sampling.
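To make the block-cyclic sampling pattern concrete, the following minimal NumPy sketch runs plain SGD on a convex least-squares objective where each cycle visits a few blocks in order and each block draws its samples from its own distribution. This only illustrates the data structure the abstract describes; it does not implement the paper's proposed fix, and the number of blocks, block-specific means, and step size are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of block-cyclic sampling (not the paper's algorithm).
# Each cycle (e.g. a day) consists of `num_blocks` blocks (e.g. time-of-day
# segments); samples within block b come from a block-specific distribution,
# modeled here by a block-specific shift of the feature mean.
rng = np.random.default_rng(0)

dim = 5
num_cycles = 20          # cycles, e.g. days
num_blocks = 4           # blocks per cycle, each with its own distribution
samples_per_block = 100  # many samples per block
step_size = 0.01

w_true = rng.normal(size=dim)
block_means = [rng.normal(scale=2.0, size=dim) for _ in range(num_blocks)]

w = np.zeros(dim)
for cycle in range(num_cycles):
    for b in range(num_blocks):          # blocks are visited cyclically
        for _ in range(samples_per_block):
            # Block-specific feature distribution: shifted mean per block.
            x = rng.normal(size=dim) + block_means[b]
            y = x @ w_true + rng.normal(scale=0.1)
            grad = (x @ w - y) * x       # gradient of 0.5 * (x.w - y)^2
            w -= step_size * grad

print("final iterate error:", np.linalg.norm(w - w_true))
```

Because consecutive samples are correlated through their block, the iterate can drift toward whichever block was visited last; this is the effect the paper shows can hurt SGD relative to i.i.d. sampling.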


Related research

03/20/2021
Demystifying the Effects of Non-Independence in Federated Learning
Federated Learning (FL) enables statistical models to be built on user-g...

02/18/2020
Distributed Optimization over Block-Cyclic Data
We consider practical data characteristics underlying federated learning...

07/11/2019
Amplifying Rényi Differential Privacy via Shuffling
Differential privacy is a useful tool to build machine learning models w...

03/24/2020
FedSel: Federated SGD under Local Differential Privacy with Top-k Dimension Selection
As massive data are produced from small gadgets, federated learning on m...

12/09/2022
Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization
Nonconvex optimization is central in solving many machine learning probl...

02/10/2023
Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD
Cyclic and randomized stepsizes are widely used in the deep learning pra...

11/20/2017
Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks
We present a stochastic first-order optimization algorithm, named BCSC, ...
