Learning RoboCup-Keepaway with Kernels

01/31/2012
by Tobias Jung, et al.

We apply kernel-based methods to the difficult reinforcement learning problem of 3vs2 keepaway in RoboCup simulated soccer. The key challenges in keepaway are the high dimensionality of the state space (which renders conventional discretization-based function approximation such as tile coding infeasible), the stochasticity caused by noise and by multiple learning agents that need to cooperate (meaning that the exact dynamics of the environment are unknown), and real-time learning (meaning that an efficient online implementation is required). We employ the general framework of approximate policy iteration with least-squares-based policy evaluation. As the underlying function approximator, we consider the family of regularization networks with the subset-of-regressors approximation. The core of our proposed solution is an efficient recursive implementation with automatic supervised selection of relevant basis functions. Simulation results indicate that the behavior learned through our approach clearly outperforms the best results obtained earlier with tile coding by Stone et al. (2005).
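To make the subset-of-regressors idea concrete, the following is a minimal batch sketch in Python with NumPy, not the authors' implementation. It assumes a Gaussian kernel, a hand-picked dictionary of basis points, and a plain supervised regression target; the paper's method is instead recursive, selects basis functions automatically, and is embedded in least-squares policy evaluation. All function names and parameter values here are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, length_scale=0.1):
    """Gaussian kernel matrix between the rows of A (n x d) and B (m x d)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * length_scale ** 2))

def fit_subset_of_regressors(X, y, dictionary, noise=0.1, length_scale=0.1):
    """Solve min_w ||y - K_nm w||^2 + noise * w^T K_mm w for the weights w.

    Only the m dictionary points act as basis functions, so the linear
    system is m x m instead of n x n.
    """
    K_nm = gaussian_kernel(X, dictionary, length_scale)
    K_mm = gaussian_kernel(dictionary, dictionary, length_scale)
    A = K_nm.T @ K_nm + noise * K_mm
    return np.linalg.solve(A, K_nm.T @ y)

def predict(X_new, dictionary, weights, length_scale=0.1):
    """Evaluate the fitted function at the rows of X_new."""
    return gaussian_kernel(X_new, dictionary, length_scale) @ weights

# Toy data (assumed for illustration): 8 samples of a sine wave.
X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])

# Hand-picked dictionary: every second training point (4 basis functions).
dictionary = X[::2]
weights = fit_subset_of_regressors(X, y, dictionary)
predictions = predict(X, dictionary, weights)
```

When the dictionary contains every training point and the kernel matrix is invertible, this solution coincides with the full regularization network (kernel ridge regression); a smaller dictionary trades accuracy for a cheaper solve, which is what makes an online implementation plausible.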


research
10/15/2017

Manifold Regularization for Kernelized LSTD

Policy evaluation or value function or Q-function approximation is a key...
research
01/31/2012

Feature Selection for Value Function Approximation Using Bayesian Model Selection

Feature selection in reinforcement learning (RL), i.e. choosing basis fu...
research
09/20/2021

A Reinforcement Learning Approach to the Stochastic Cutting Stock Problem

We propose a formulation of the stochastic cutting stock problem as a di...
research
07/04/2012

Representation Policy Iteration

This paper addresses a fundamental issue central to approximation method...
research
03/17/2023

A Policy Iteration Approach for Flock Motion Control

The flocking motion control is concerned with managing the possible conf...
research
01/27/2019

Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift

In this paper we revisit the method of off-policy corrections for reinfo...
research
01/13/2022

Recursive Least Squares Policy Control with Echo State Network

The echo state network (ESN) is a special type of recurrent neural netwo...
