SIMPLE: A Gradient Estimator for k-Subset Sampling

10/04/2022
by Kareem Ahmed, et al.

k-subset sampling is ubiquitous in machine learning, enabling regularization and interpretability through sparsity. The challenge lies in rendering k-subset sampling amenable to end-to-end learning. This has typically involved relaxing the reparameterized samples to allow for backpropagation, at the risk of introducing high bias and high variance. In this work, we fall back to discrete k-subset sampling on the forward pass. This is coupled with using the gradient with respect to the exact marginals, computed efficiently, as a proxy for the true gradient. We show that our gradient estimator, SIMPLE, exhibits lower bias and variance compared to state-of-the-art estimators, including the straight-through Gumbel estimator when k = 1. Empirical results show improved performance on learning to explain and sparse linear regression. We also provide an algorithm for computing the exact ELBO for the k-subset distribution, obtaining significantly lower loss than state-of-the-art methods.
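
The estimator described above admits a compact realization. Below is a minimal PyTorch sketch of the idea, not the authors' reference implementation: the function names (log_esp, marginals, sample_ksubset, simple_sample) are illustrative, the exact marginals come from a naive O(n*k) dynamic program over elementary symmetric polynomials, and the backward pass splices in the gradient of those marginals via the standard straight-through substitution. SIMPLE's actual sampling and marginal computations may be organized differently and more efficiently.

```python
import torch


def log_esp(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Log elementary symmetric polynomial log e_k(exp(logits)), i.e. the
    log-partition function of the k-subset distribution, computed with the
    standard O(n*k) dynamic program carried out in log-space."""
    dp = [torch.zeros((), dtype=logits.dtype)]  # log e_0 = 0
    dp += [torch.full((), float("-inf"), dtype=logits.dtype) for _ in range(k)]
    for i in range(logits.shape[0]):
        for j in range(min(i + 1, k), 0, -1):  # descending j keeps the update valid
            dp[j] = torch.logaddexp(dp[j], dp[j - 1] + logits[i])
    return dp[k]


def marginals(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Exact inclusion probabilities Pr[i in S]. They equal the gradient of
    the log-partition function, so autograd computes them; create_graph=True
    keeps them differentiable (logits must require grad)."""
    return torch.autograd.grad(log_esp(logits, k), logits, create_graph=True)[0]


def sample_ksubset(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Exact sequential sampling: include item i with probability
    w_i * e_{r-1}(items after i) / e_r(items from i), where r is the
    number of slots still to fill."""
    n, lw = logits.shape[0], logits.detach()
    # log_e[i, j] = log e_j over items i..n-1 (suffix ESPs)
    log_e = torch.full((n + 1, k + 1), float("-inf"), dtype=lw.dtype)
    log_e[:, 0] = 0.0
    for i in range(n - 1, -1, -1):
        for j in range(1, k + 1):
            log_e[i, j] = torch.logaddexp(log_e[i + 1, j],
                                          lw[i] + log_e[i + 1, j - 1])
    hard, r = torch.zeros(n, dtype=lw.dtype), k
    for i in range(n):
        if r == 0:
            break
        p_in = torch.exp(lw[i] + log_e[i + 1, r - 1] - log_e[i, r])
        if n - i <= r or torch.rand(()) < p_in:  # guard: must take all leftovers
            hard[i], r = 1.0, r - 1
    return hard


def simple_sample(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Forward pass: exact discrete k-hot sample. Backward pass: gradient of
    the exact marginals, via the identity x = hard + (mu - mu.detach())."""
    mu = marginals(logits, k)
    return sample_ksubset(logits, k) + (mu - mu.detach())


# Toy usage: select k = 3 of 10 features and differentiate through the choice.
logits = torch.randn(10, requires_grad=True)
x = simple_sample(logits, k=3)   # exact k-hot sample on the forward pass
loss = (x * torch.arange(10.0)).sum()
loss.backward()                  # gradient flows through the exact marginals
```

Note that the marginals sum to k, so the backward signal respects the cardinality constraint, and when k = 1 they reduce to the softmax of the logits.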


Related Research

01/29/2019 · Differentiable Subset Sampling
Many machine learning tasks require sampling a subset of items from a co...

10/20/2020 · VarGrad: A Low-Variance Gradient Estimator for Variational Inference
We analyse the properties of an unbiased gradient estimator of the ELBO ...

09/07/2023 · DBsurf: A Discrepancy Based Method for Discrete Stochastic Gradient Estimation
Computing gradients of an expectation with respect to the distributional...

06/09/2022 · On the Bias-Variance Characteristics of LIME and SHAP in High Sparsity Movie Recommendation Explanation Tasks
We evaluate two popular local explainability techniques, LIME and SHAP, ...

12/11/2021 · Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD
Stochastic gradient descent (SGD) is a cornerstone of machine learning. ...

06/11/2020 · Probabilistic Best Subset Selection by Gradient-Based Optimization
In high-dimensional statistics, variable selection is an optimization pr...

06/10/2021 · Bias, Consistency, and Alternative Perspectives of the Infinitesimal Jackknife
Though introduced nearly 50 years ago, the infinitesimal jackknife (IJ) ...
