Policy Gradient Optimal Correlation Search for Variance Reduction in Monte Carlo simulation and Maximum Optimal Transport
We propose a new algorithm for variance reduction when estimating E[f(X_T)], where X is the solution to a stochastic differential equation and f is a test function. The new estimator is (f(X^1_T) + f(X^2_T))/2, where X^1 and X^2 have the same marginal law as X but are pathwise correlated so as to reduce the variance. The optimal correlation function ρ is approximated by a deep neural network and is calibrated along the trajectories of (X^1, X^2) by policy gradient and reinforcement learning techniques. Finding an optimal coupling given marginal laws has links with maximum optimal transport.
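To illustrate the paired estimator, here is a minimal Python sketch under strong simplifying assumptions that are not taken from the paper: the SDE is a geometric Brownian motion discretized by an Euler scheme, f is a call-type payoff, and the correlation ρ is a fixed constant rather than a learned, state-dependent function. The function name `paired_estimator_samples` and all parameter values are hypothetical. The two driving Brownian motions are coupled through dW^2 = ρ dW^1 + sqrt(1 - ρ^2) dZ, so each path keeps the marginal law of X while the pair is correlated.

```python
import numpy as np

def paired_estimator_samples(rho, n_paths=20000, n_steps=50, T=1.0,
                             x0=1.0, mu=0.05, sigma=0.2, strike=1.0, seed=0):
    """Return samples of (f(X^1_T) + f(X^2_T)) / 2 for a constant correlation rho.

    Assumptions (illustrative only): GBM dynamics, Euler scheme,
    f(x) = max(x - strike, 0), and a constant rho in [-1, 1].
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x1 = np.full(n_paths, x0)
    x2 = np.full(n_paths, x0)
    for _ in range(n_steps):
        dw1 = rng.normal(0.0, np.sqrt(dt), n_paths)
        dz = rng.normal(0.0, np.sqrt(dt), n_paths)
        # dw2 is again a Brownian increment, correlated with dw1 by rho.
        dw2 = rho * dw1 + np.sqrt(1.0 - rho**2) * dz
        x1 = x1 + mu * x1 * dt + sigma * x1 * dw1
        x2 = x2 + mu * x2 * dt + sigma * x2 * dw2
    payoff = lambda x: np.maximum(x - strike, 0.0)
    return 0.5 * (payoff(x1) + payoff(x2))

indep = paired_estimator_samples(rho=0.0)   # independent pair
anti = paired_estimator_samples(rho=-1.0)   # antithetic pair
print("rho =  0: mean", indep.mean(), "variance", indep.var())
print("rho = -1: mean", anti.mean(), "variance", anti.var())
```

Both choices of ρ estimate the same expectation, but for this monotone payoff the negatively correlated pair should show a smaller sample variance. The abstract's method goes further by letting ρ depend on the trajectories and training it with policy gradient; that part is not reproduced here.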