Mix and Mask Actor-Critic Methods

06/24/2021
by   Dom Huh, et al.

Shared feature spaces for actor-critic methods aim to capture generalized latent representations to be used by both the policy and the value function, in the hope of more stable and sample-efficient optimization. However, such a paradigm presents a number of challenges in practice, as the parameters generating a shared representation must learn from two distinct objectives, resulting in competing updates and learning perturbations. In this paper, we present a novel feature-sharing framework to address these difficulties by introducing the mix and mask mechanisms and the distributional scalarization technique. The mix and mask mechanisms act dynamically to couple and decouple latent features variably between the policy and the value function, while distributional scalarization standardizes the two objectives from a probabilistic standpoint. Our experimental results demonstrate significant performance improvements over alternative methods that use separate networks or networks with a shared backbone.
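To make the masking idea concrete, here is a minimal NumPy sketch of per-head feature gates over a shared backbone. All names, dimensions, and the soft sigmoid gating are illustrative assumptions, not the paper's actual implementation: the point is only that each head (policy and value) sees the shared latent features through its own learnable mask, so features can be coupled or decoupled between the two objectives.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not taken from the paper)
obs_dim, hidden_dim, n_actions = 4, 8, 2

# Shared backbone: a single linear layer producing a shared latent representation
W_shared = rng.normal(scale=0.1, size=(obs_dim, hidden_dim))

# One set of mask logits per head; a sigmoid turns them into soft gates in (0, 1).
# Gates near 0 decouple a feature from that head; near 1, they couple it.
mask_logits_pi = rng.normal(size=hidden_dim)
mask_logits_v = rng.normal(size=hidden_dim)

# Head parameters
W_pi = rng.normal(scale=0.1, size=(hidden_dim, n_actions))
w_v = rng.normal(scale=0.1, size=hidden_dim)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(obs):
    z = np.tanh(obs @ W_shared)           # shared latent features
    z_pi = z * sigmoid(mask_logits_pi)    # features gated toward the policy head
    z_v = z * sigmoid(mask_logits_v)      # features gated toward the value head
    logits = z_pi @ W_pi                  # policy head output (action logits)
    value = float(z_v @ w_v)              # value head output (scalar estimate)
    return logits, value

obs = rng.normal(size=obs_dim)
logits, value = forward(obs)
```

In a full training loop, the mask logits would be updated alongside the other parameters, letting gradient descent decide which shared features each objective relies on, rather than fixing the split by hand.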


