Stochastic Doubly Robust Gradient

12/21/2018
by   Kanghoon Lee, et al.

When training a machine learning model on observational data, some values are often systematically missing. Learning from incomplete data in which the missingness depends on some covariates may lead to biased parameter estimates and can even harm the fairness of decision outcomes. This paper proposes a way to adjust for the causal effect of covariates on the missingness when training models with stochastic gradient descent (SGD). Inspired by the design of the doubly robust estimator and its theoretical property of double robustness, we introduce the stochastic doubly robust gradient (SDRG), which consists of two models: weight-corrected gradients for inverse propensity score weighting and per-covariate control variates for regression adjustment. We also identify the connection between double robustness and variance reduction in SGD by presenting the SDRG algorithm within a unifying framework for variance-reduced SGD. We evaluate our approach empirically by examining convergence when training image classifiers on several examples of missing data.
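The abstract does not state the SDRG update rule, so the following is only a minimal sketch under assumptions, not the authors' algorithm. It applies the standard doubly robust (augmented inverse propensity weighted) form to per-example gradients: a control variate plus a propensity-weighted residual. The function name `doubly_robust_gradient`, its parameters, and the toy data are all hypothetical and chosen for illustration.

```python
import numpy as np

def doubly_robust_gradient(grads, observed, propensity, control):
    """Hypothetical doubly robust gradient estimate (illustrative only).

    grads      : (n, d) per-example gradients; rows with observed == 0 are ignored
    observed   : (n,)   1 if the example is observed, 0 if it is missing
    propensity : (n,)   estimated P(observed = 1 | covariates)
    control    : (n, d) control variates, i.e. a regression-style guess of each gradient
    """
    # Inverse propensity weights; zero for missing examples.
    w = (observed / propensity)[:, None]
    # Residual between the true gradient and the control variate,
    # masked so that missing rows contribute nothing.
    residual = np.where(observed[:, None] == 1, grads - control, 0.0)
    # Regression adjustment plus weight-corrected residual, averaged over examples.
    return (control + w * residual).mean(axis=0)

# Toy data (assumed): two covariate groups with gradients 1 and 3.
# Group 1 is observed with probability 0.5, so one of its rows is missing.
grads = np.array([[1.0], [1.0], [3.0], [np.nan]])  # last gradient is unobserved
observed = np.array([1.0, 1.0, 1.0, 0.0])
true_propensity = np.array([1.0, 1.0, 0.5, 0.5])
perfect_control = np.array([[1.0], [1.0], [3.0], [3.0]])  # group-wise mean gradient

# Mean over observed rows only: biased toward the fully observed group.
naive = grads[observed == 1].mean(axis=0)

# Correct propensity, uninformative control: reduces to inverse propensity weighting.
ipw_only = doubly_robust_gradient(grads, observed, true_propensity, np.zeros((4, 1)))

# Wrong propensity (all ones), correct control: regression adjustment alone suffices.
control_only = doubly_robust_gradient(grads, observed, np.ones(4), perfect_control)

print(naive, ipw_only, control_only)
```

In this toy setting the full-data mean gradient is 2. The naive average over observed rows misses it, while the sketch recovers it when either the propensity or the control variate is correct, which is the double robustness property the abstract refers to.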


research
06/11/2015

Variance Reduced Stochastic Gradient Descent with Neighbors

Stochastic Gradient Descent (SGD) is a workhorse in machine learning, ye...
research
08/10/2021

An Analysis of Stochastic Variance Reduced Gradient for Linear Inverse Problems

Stochastic variance reduced gradient (SVRG) is a popular variance reduct...
research
10/13/2021

On the Double Descent of Random Features Models Trained with SGD

We study generalization properties of random features (RF) regression in...
research
10/25/2019

Bias-Variance Tradeoff in a Sliding Window Implementation of the Stochastic Gradient Algorithm

This paper provides a framework to analyze stochastic gradient algorithm...
research
01/05/2023

Improve Efficiency of Doubly Robust Estimator when Propensity Score is Misspecified

Doubly robust (DR) estimation is a crucial technique in causal inference...
research
08/03/2021

Normalized Augmented Inverse Probability Weighting with Neural Network Predictions

The estimation of Average Treatment Effect (ATE) as a causal parameter i...
research
01/08/2019

Double-Robust Estimation in Difference-in-Differences with an Application to Traffic Safety Evaluation

Difference-in-differences (DID) is a widely used approach for drawing ca...
