Wesley Chung | DeepAI

Featured Co-authors

Bo Dai
134 publications
Csaba Szepesvari
116 publications
Dale Schuurmans
78 publications
Martha White
65 publications
Vincent Liu
17 publications
Jincheng Mei
11 publications
Jian Qian
10 publications
Valentin Thomas
6 publications
Matthew Schlegel
5 publications
Daniel Graves
5 publications
Touqir Sajed
4 publications

research

∙ 01/16/2023

The Role of Baselines in Policy Gradient Optimization

We study the effect of baselines in on-policy stochastic policy gradient...

Jincheng Mei, et al.

research

∙ 07/05/2019

Incrementally Learning Functions of the Return

Temporal difference methods enable efficient estimation of value functio...

Brendan Bennett, et al.

research

∙ 06/11/2019

Importance Resampling for Off-policy Prediction

Importance sampling (IS) is a common reweighting strategy for off-policy...

Matthew Schlegel, et al.

research

∙ 08/28/2018

High-confidence error estimates for learned value functions

Estimating the value function for a fixed policy is a fundamental proble...

Touqir Sajed, et al.
