Learning Action Representations for Reinforcement Learning

02/01/2019
by   Yash Chandak, et al.
0

Most model-free reinforcement learning methods leverage state representations (embeddings) for generalization, but either ignore structure in the space of actions or assume the structure is provided a priori. We show how a policy can be decomposed into a component that acts in a low-dimensional space of action representations and a component that transforms these representations into actual actions. These representations improve generalization over large, finite action sets by allowing the agent to infer the outcomes of actions similar to actions already taken. We provide an algorithm to both learn and use action representations and provide conditions for its convergence. The efficacy of the proposed method is demonstrated on large-scale real-world problems.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/09/2020

Joint State-Action Embedding for Efficient Reinforcement Learning

While reinforcement learning has achieved considerable successes in rece...
research
02/04/2019

The Natural Language of Actions

We introduce Act2Vec, a general framework for learning context-based act...
research
11/23/2022

Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning

Deep reinforcement learning (DRL) breaks through the bottlenecks of trad...
research
06/28/2023

DCT: Dual Channel Training of Action Embeddings for Reinforcement Learning with Large Discrete Action Spaces

The ability to learn robust policies while generalizing over large discr...
research
12/12/2012

The Thing That We Tried Didn't Work Very Well : Deictic Representation in Reinforcement Learning

Most reinforcement learning methods operate on propositional representat...
research
12/13/2019

Long-Term Planning and Situational Awareness in OpenAI Five

Understanding how knowledge about the world is represented within model-...
research
06/05/2019

Lifelong Learning with a Changing Action Set

In many real-world sequential decision making problems, the number of av...

Please sign up or login with your details

Forgot password? Click here to reset