Learning Navigation Costs from Demonstration in Partially Observable Environments

by Tianyu Wang, et al.

This paper focuses on inverse reinforcement learning (IRL) to enable safe and efficient autonomous navigation in unknown partially observable environments. The objective is to infer a cost function that explains expert-demonstrated navigation behavior while relying only on the observations and state-control trajectory used by the expert. We develop a cost function representation composed of two parts: a probabilistic occupancy encoder, with recurrent dependence on the observation sequence, and a cost encoder, defined over the occupancy features. The representation parameters are optimized by differentiating the error between demonstrated controls and a control policy computed from the cost encoder. Such differentiation is typically computed by dynamic programming through the value function over the whole state space. We observe that this is inefficient in large partially observable environments because most states are unexplored. Instead, we rely on a closed-form subgradient of the cost-to-go obtained only over a subset of promising states via an efficient motion-planning algorithm such as A* or RRT. Our experiments show that our model exceeds the accuracy of baseline IRL algorithms in robot navigation tasks, while substantially improving the efficiency of training and test-time inference.
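The idea of differentiating the cost-to-go through a planner, rather than through dynamic programming over the whole state space, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a deterministic 4-connected grid with strictly positive per-cell costs and a reachable goal, and it swaps in Dijkstra's algorithm for A* or RRT. The function names `plan` and `cost_to_go_subgradient` are hypothetical. Under these assumptions the optimal cost-to-go is a minimum over path costs that are linear in the cell costs, so counting how often each cell is entered along an optimal path gives a (sub)gradient with respect to those costs, and only cells the planner touches are ever visited.

```python
import heapq


def plan(cost, start, goal):
    """Dijkstra on a 4-connected grid; entering cell (r, c) costs cost[r][c].

    Assumes strictly positive costs and a reachable goal.
    Returns (optimal cost-to-go from start, list of cells on one optimal path).
    """
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        d, node = heapq.heappop(frontier)
        if node == goal:
            break  # goal settled; unexplored states are never expanded
        if d > dist[node]:
            continue  # stale queue entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = node
                    heapq.heappush(frontier, (nd, (nr, nc)))
    # Walk parent pointers back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return dist[goal], path


def cost_to_go_subgradient(cost, start, goal):
    """Return the cost-to-go and its (sub)gradient w.r.t. every cell cost.

    The gradient entry for a cell is the number of times the optimal path
    enters it (0 or 1 here); the start cell is not entered, so it stays 0.
    """
    value, path = plan(cost, start, goal)
    grad = [[0.0] * len(cost[0]) for _ in cost]
    for r, c in path[1:]:
        grad[r][c] += 1.0
    return value, grad


if __name__ == "__main__":
    grid = [[1.0, 1.0, 1.0],
            [5.0, 9.0, 1.0],
            [5.0, 5.0, 1.0]]
    value, grad = cost_to_go_subgradient(grid, (0, 0), (2, 2))
    print(value)
    for row in grad:
        print(row)
```

In a learning setting, this sparse gradient would be chained with the gradient of the cost representation's parameters; that step is omitted here. When several paths tie for optimal, the result is one valid choice among several rather than a unique gradient.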




Learning Navigation Costs from Demonstration with Semantic Observations

This paper focuses on inverse reinforcement learning (IRL) for autonomou...

Inverse reinforcement learning for autonomous navigation via differentiable semantic mapping and planning

This paper focuses on inverse reinforcement learning for autonomous navi...

Model Predictive Path Integral Control Framework for Partially Observable Navigation: A Quadrotor Case Study

Recently, Model Predictive Path Integral (MPPI) control algorithm has be...

A genetic algorithm for autonomous navigation in partially observable domain

The problem of autonomous navigation is one of the basic problems for ro...

Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation

We propose to take a novel approach to robot system design where each bu...

A Robustness Analysis of Inverse Optimal Control of Bipedal Walking

Cost functions have the potential to provide compact and understandable ...

Partially Observable Planning and Learning for Systems with Non-Uniform Dynamics

We propose a neural network architecture, called TransNet, that combines...
