On Exploration, Exploitation and Learning in Adaptive Importance Sampling

10/31/2018
by Xiaoyu Lu, et al.

We study adaptive importance sampling (AIS) as an online learning problem and argue for the importance of the trade-off between exploration and exploitation in this adaptation. Borrowing ideas from the bandits literature, we propose Daisee, a partition-based AIS algorithm. We further introduce a notion of regret for AIS and show that Daisee has $O(\sqrt{T}(\log T)^{3/4})$ cumulative pseudo-regret, where $T$ is the number of iterations. We then extend Daisee to adaptively learn a hierarchical partitioning of the sample space for more efficient sampling and confirm the performance of both algorithms empirically.
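To make the exploration/exploitation trade-off concrete, the following is a rough sketch of what a partition-based adaptive importance sampler with an exploration bonus might look like. It is not the Daisee algorithm, whose details the abstract does not give. The function name `partition_ais`, the `bonus` parameter, the UCB-style score, the equal-width bins, and the uniform target on [0, 1] are all assumptions made for illustration.

```python
import math
import random


def partition_ais(f, num_bins=8, num_iters=5000, bonus=1.0, seed=0):
    """Estimate the integral of f over [0, 1] with an adaptive, partition-based proposal.

    Illustrative sketch only (hypothetical names and scoring rule, not from the paper).
    """
    rng = random.Random(seed)
    counts = [0] * num_bins      # number of samples drawn from each bin
    sum_sq = [0.0] * num_bins    # running sum of f(x)^2 observed in each bin
    total = 0.0                  # running sum of importance-weighted values

    for t in range(1, num_iters + 1):
        # Exploitation: favour bins where |f| has looked large so far.
        # Exploration: add a bonus that shrinks as a bin is sampled more often.
        scores = []
        for k in range(num_bins):
            n = max(counts[k], 1)
            mean_sq = sum_sq[k] / n
            scores.append(math.sqrt(mean_sq) + bonus * math.sqrt(math.log(t + 1) / n))
        z = sum(scores)
        probs = [s / z for s in scores]

        # Sample a bin from the current proposal, then a point uniformly inside it.
        k = rng.choices(range(num_bins), weights=probs)[0]
        x = (k + rng.random()) / num_bins
        fx = f(x)

        # Target density is 1 on [0, 1]; proposal density inside bin k is probs[k] * num_bins.
        weight = 1.0 / (probs[k] * num_bins)
        total += fx * weight

        counts[k] += 1
        sum_sq[k] += fx * fx

    return total / num_iters, counts


if __name__ == "__main__":
    estimate, counts = partition_ais(lambda x: x * x)
    print("estimate:", estimate)   # true value is 1/3
    print("samples per bin:", counts)
```

Because the bin probabilities at each step depend only on earlier samples, each weighted draw is an unbiased estimate of the integral. The bonus term keeps every bin's probability strictly positive, so the importance weights stay finite.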


research
03/23/2021

Adaptive Importance Sampling for Finite-Sum Optimization and Sampling with Decreasing Step-Sizes

Reducing the variance of the gradient estimator is known to improve the ...
research
02/09/2023

The Sample Complexity of Approximate Rejection Sampling with Applications to Smoothed Online Learning

Suppose we are given access to n independent samples from distribution μ...
research
12/15/2020

Policy Optimization as Online Learning with Mediator Feedback

Policy Optimization (PO) is a widely used approach to address continuous...
research
09/12/2017

Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits

In this paper, we propose and study opportunistic bandits - a new varian...
research
06/09/2019

Balanced Off-Policy Evaluation in General Action Spaces

In many practical applications of contextual bandits, online learning is...
research
05/22/2023

Hierarchical Partitioning Forecaster

In this work we consider a new family of algorithms for sequential predi...
