Scene Transformer: A unified multi-task model for behavior prediction and planning

by Jiquan Ngiam, et al.

Predicting the future motion of multiple agents is necessary for planning in dynamic environments. This task is challenging for autonomous driving because agents (e.g., vehicles and pedestrians) and their associated behaviors may be diverse and influence each other. Most prior work has focused on first predicting independent futures for each agent based on all past motion, and then planning against these independent predictions. However, planning against fixed predictions can fail to represent the possible future interactions between agents, leading to sub-optimal planning. In this work, we formulate a model that jointly predicts the behavior of all agents in real-world driving environments in a unified manner. Inspired by recent language modeling approaches, we use a masking strategy as the query to our model, so that a single model can be invoked to predict agent behavior in many ways, for example conditioned on the goal or full future trajectory of the autonomous vehicle, or on the behavior of other agents in the environment. Our model architecture fuses heterogeneous world state in a unified Transformer architecture by employing attention across road elements, agent interactions and time steps. We evaluate our approach on autonomous driving datasets for behavior prediction and achieve state-of-the-art performance. Our work demonstrates that formulating behavior prediction in a unified architecture with a masking strategy may allow a single model to perform multiple motion prediction and planning related tasks effectively.
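The idea of using a mask as the query can be illustrated with a minimal sketch. It assumes the scene is laid out as an agents-by-time-steps grid in which each cell is either shown to the model or hidden and left for it to predict. The function name `build_mask`, the three task labels, and the choice of agent 0 as the autonomous vehicle are hypothetical names introduced here for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List

# mask[agent][time_step]; True means the state is visible to the model,
# False means it is hidden and must be predicted.
Mask = List[List[bool]]


def build_mask(num_agents: int, num_past: int, num_future: int,
               task: str, av_index: int = 0) -> Mask:
    """Build a visibility mask for one of three illustrative query types.

    All names and task labels are assumptions made for this sketch.
    """
    total = num_past + num_future
    # Every query reveals the observed history of every agent.
    mask = [[t < num_past for t in range(total)] for _ in range(num_agents)]

    if task == "motion_prediction":
        # All future steps of all agents stay hidden.
        pass
    elif task == "conditional_motion_prediction":
        # Reveal the autonomous vehicle's full future trajectory.
        for t in range(num_past, total):
            mask[av_index][t] = True
    elif task == "goal_conditioned_prediction":
        # Reveal only the autonomous vehicle's final state (its goal).
        mask[av_index][total - 1] = True
    else:
        raise ValueError(f"unknown task: {task}")
    return mask


if __name__ == "__main__":
    for task in ("motion_prediction",
                 "conditional_motion_prediction",
                 "goal_conditioned_prediction"):
        print(task, build_mask(2, 2, 3, task))
```

Under these assumptions, switching tasks changes only the mask passed alongside the scene, not the model, which is the sense in which a single model could serve several prediction and planning queries.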


PiP: Planning-informed Trajectory Prediction for Autonomous Driving

It is critical to predict the motion of surrounding vehicles for self-dr...

ScePT: Scene-consistent, Policy-based Trajectory Predictions for Planning

Trajectory prediction is a critical functionality of autonomous systems ...

Rules of the Road: Predicting Driving Behavior with a Convolutional Model of Semantic Interactions

We focus on the problem of predicting future states of entities in compl...

CVAE-H: Conditionalizing Variational Autoencoders via Hypernetworks and Trajectory Forecasting for Autonomous Driving

The task of predicting stochastic behaviors of road agents in diverse en...

MARC: Multipolicy and Risk-aware Contingency Planning for Autonomous Driving

Generating safe and non-conservative behaviors in dense, dynamic environ...

MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying

Motion prediction is crucial for autonomous driving systems to understan...

Human Motion Trajectory Prediction: A Survey

With growing numbers of intelligent systems in human environments, the a...
