Lower Generalization Bounds for GD and SGD in Smooth Stochastic Convex Optimization

03/19/2023
by Peiyuan Zhang, et al.

The learning theory community has recently made progress in characterizing the generalization error of gradient methods for general convex losses. In this work, we focus on how training longer affects generalization in smooth stochastic convex optimization (SCO) problems. We first provide tight excess risk lower bounds for general non-realizable SCO problems. Furthermore, existing upper bounds suggest that sample complexity can be improved by assuming the loss is realizable, i.e., that a single optimal solution simultaneously minimizes the loss at every data point. However, this improvement is compromised when the training time is long, and matching lower bounds have been lacking. Our paper examines this observation by providing excess risk lower bounds for gradient descent (GD) and stochastic gradient descent (SGD) in two realizable settings: (1) T = O(n) and (2) T = Ω(n), where T denotes the number of training iterations and n is the size of the training dataset. These bounds are novel and informative about the relationship between T and n. In the short-horizon case T = O(n), our lower bounds almost tightly match the corresponding upper bounds, providing the first optimality certificates for them. For the realizable case with T = Ω(n), however, a gap remains between the lower and upper bounds. We conjecture that this gap can be closed by improving the upper bounds, and we support this conjecture with analyses of one-dimensional and linear regression instances.
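To make the interplay between the training horizon T and the sample size n concrete, here is a minimal simulation sketch (our own illustration, not an experiment from the paper): SGD with a constant step size on a realizable least-squares instance, with the excess risk approximated on a held-out set. The dimension, step size, and checkpoints are assumed values chosen for illustration.

import numpy as np

# Illustrative sketch (not the paper's experiment): SGD on a realizable
# least-squares instance. Realizability here means the labels are generated
# noiselessly by a ground-truth vector, so one solution fits every data
# point and the minimum population risk is zero.
rng = np.random.default_rng(0)
d, n, n_test = 20, 100, 10_000           # assumed sizes, not from the paper

w_star = rng.normal(size=d)              # ground-truth predictor
X_train = rng.normal(size=(n, d))
y_train = X_train @ w_star               # zero label noise => realizable
X_test = rng.normal(size=(n_test, d))    # held-out set approximating the population
y_test = X_test @ w_star

def excess_risk(w):
    # Since the optimal population risk is zero, the held-out mean squared
    # error directly estimates the excess risk.
    return np.mean((X_test @ w - y_test) ** 2)

w = np.zeros(d)
eta = 0.01                               # assumed constant step size
for t in range(1, 10 * n + 1):           # run well past T = n
    i = rng.integers(n)                  # one uniformly sampled point per step
    grad = 2 * (X_train[i] @ w - y_train[i]) * X_train[i]
    w -= eta * grad
    if t in (n // 2, n, 10 * n):         # checkpoints around and beyond T = n
        print(f"T = {t:5d} (T/n = {t / n:4.1f}): excess risk ~ {excess_risk(w):.4f}")

On a benign instance like this one, training past T = n keeps reducing the held-out error; the paper's lower bounds concern the worst case over problem instances, where the behavior can differ.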


