Approximate Cross-validation: Guarantees for Model Assessment and Selection

03/02/2020
by Ashia Wilson, et al.

Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV requires repeatedly refitting a learning procedure on a large number of training datasets. Recent work in empirical risk minimization (ERM) approximates the expensive refitting with a single Newton step warm-started from the full training set optimizer. While this can greatly reduce runtime, several open questions remain, including whether these approximations lead to faithful model selection and whether they are suitable for non-smooth objectives. We address these questions with three main contributions: (i) we provide uniform non-asymptotic, deterministic model assessment guarantees for approximate CV; (ii) we show that (roughly) the same conditions also guarantee model selection performance comparable to CV; (iii) we provide a proximal Newton extension of the approximate CV framework for non-smooth prediction problems and develop improved assessment guarantees for problems such as ℓ1-regularized ERM.
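
To make the single-Newton-step idea concrete, here is a minimal sketch for leave-one-out CV on ridge regression. It is an illustration under stated assumptions, not the paper's implementation: the objective scaling (1/(2n) squared loss plus (lam/2) times the squared norm) and the function names `fit_ridge`, `newton_step_loo`, `exact_loo`, and `loo_cv_error` are all hypothetical choices made for this example. Because the ridge objective is quadratic, one Newton step from the full-data optimizer recovers each leave-one-out fit exactly, which gives a convenient sanity check; for non-quadratic losses the same step would only approximate the refit.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Minimize (1/(2n)) * ||X b - y||^2 + (lam/2) * ||b||^2 in closed form."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


def newton_step_loo(X, y, lam):
    """Leave-one-out coefficients via one Newton step from the full-data fit."""
    n, d = X.shape
    beta_hat = fit_ridge(X, y, lam)
    hess_full = X.T @ X / n + lam * np.eye(d)  # Hessian of the full objective
    residuals = X @ beta_hat - y
    loo_betas = np.empty((n, d))
    for i in range(n):
        x_i = X[i]
        # Gradient of point i's loss 0.5 * (x_i . b - y_i)^2 at beta_hat.
        grad_i = residuals[i] * x_i
        # Hessian of the objective with point i removed.
        hess_loo = hess_full - np.outer(x_i, x_i) / n
        # The full gradient vanishes at beta_hat, so the leave-one-out
        # gradient there is -grad_i / n; the Newton step subtracts
        # hess_loo^{-1} times that gradient.
        loo_betas[i] = beta_hat + np.linalg.solve(hess_loo, grad_i / n)
    return loo_betas


def exact_loo(X, y, lam):
    """Reference: refit from scratch with each point held out (same scaling)."""
    n, d = X.shape
    loo_betas = np.empty((n, d))
    for i in range(n):
        keep = np.arange(n) != i
        X_i, y_i = X[keep], y[keep]
        loo_betas[i] = np.linalg.solve(
            X_i.T @ X_i / n + lam * np.eye(d), X_i.T @ y_i / n
        )
    return loo_betas


def loo_cv_error(X, y, loo_betas):
    """Average held-out squared loss, using row i of loo_betas for point i."""
    preds = np.einsum("ij,ij->i", X, loo_betas)
    return 0.5 * np.mean((preds - y) ** 2)
```

In this sketch, `newton_step_loo` solves one small linear system per held-out point instead of rerunning an optimizer, which is where the runtime savings described above would come from when the loss has no closed-form minimizer.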

research
03/15/2023

Distribution-free Deviation Bounds of Learning via Model Selection with Cross-validation Risk Estimation

Cross-validation techniques for risk estimation and model selection are ...
research
03/05/2023

Iterative Approximate Cross-Validation

Cross-validation (CV) is one of the most popular tools for assessing and...
research
04/24/2019

Bayesian leave-one-out cross-validation for large data

Model inference, such as model comparison, model checking, and model sel...
research
09/25/2022

Algorithms that Approximate Data Removal: New Results and Limitations

We study the problem of deleting user data from machine learning models ...
research
12/24/2020

Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

As the main workhorse for model selection, Cross Validation (CV) has ach...
research
01/20/2022

Nonnested model selection based on empirical likelihood

We propose an empirical likelihood ratio test for nonparametric model se...
research
04/02/2020

Approximate Selection with Guarantees using Proxies

Due to the falling costs of data acquisition and storage, researchers an...