Learning Scalable Policies over Graphs for Multi-Robot Task Allocation using Capsule Attention Networks

by Steve Paul, et al.

This paper presents a novel graph reinforcement learning (RL) architecture for solving multi-robot task allocation (MRTA) problems that involve tasks with deadlines and workloads, and robot constraints such as work capacity. Drawing motivation from recent graph learning methods that use RL to solve combinatorial optimization (CO) problems such as the multi-Traveling Salesman and Vehicle Routing Problems, this paper seeks better performance than non-learning methods and better scalability than existing learning architectures for the stated class of MRTA problems. The proposed neural architecture, called Capsule Attention-based Mechanism or CapAM, acts as the policy network and includes three main components: 1) an encoder: a Capsule Network based node embedding model that represents each task as a learnable feature vector; 2) a decoder: an attention-based model that produces the output sequentially; and 3) a context: an encoding of the states of the mission and the robots. The CapAM model is trained with a policy-gradient method based on REINFORCE. When evaluated over unseen scenarios, CapAM demonstrates better task completion performance and >10 times faster decision-making compared to standard non-learning based online MRTA methods. CapAM's advantages in generalizability, and in scalability to test problems larger than those used in training, are also demonstrated in comparison to a popular approach for learning to solve CO problems, namely a purely attention-based mechanism.
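To make the decoder, context, and REINFORCE pieces of the abstract concrete, here is a minimal Python sketch of sequential, attention-style task selection with a policy-gradient loss. It is not the CapAM implementation. The Capsule Network encoder is left out and replaced by fixed toy embeddings, the deadline and capacity constraints are assumed to reduce to a simple feasibility mask, and every name (`masked_softmax`, `decode_step`, `rollout`, `reinforce_loss`) is hypothetical.

```python
import math
import random


def masked_softmax(scores, feasible):
    """Softmax over feasible entries only; infeasible entries get probability 0."""
    top = max(s for s, ok in zip(scores, feasible) if ok)
    exps = [math.exp(s - top) if ok else 0.0 for s, ok in zip(scores, feasible)]
    total = sum(exps)
    return [e / total for e in exps]


def decode_step(context, task_embeddings, feasible, rng):
    """One decoder step: score each task against the context, mask, then sample."""
    scale = math.sqrt(len(context))
    scores = [
        sum(c * t for c, t in zip(context, emb)) / scale
        for emb in task_embeddings
    ]
    probs = masked_softmax(scores, feasible)
    choice = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return choice, math.log(probs[choice])


def rollout(context, task_embeddings, rng):
    """Pick tasks one at a time until none remain.

    Returns the selection order and the summed log-probability of that order.
    A real MRTA policy would also update the context after each choice; the
    context is held fixed here to keep the sketch short (an assumption).
    """
    feasible = [True] * len(task_embeddings)
    order, log_prob = [], 0.0
    while any(feasible):
        choice, step_log_prob = decode_step(context, task_embeddings, feasible, rng)
        feasible[choice] = False  # a chosen task cannot be selected again
        order.append(choice)
        log_prob += step_log_prob
    return order, log_prob


def reinforce_loss(log_prob, reward, baseline):
    """REINFORCE surrogate loss for one rollout.

    Minimizing it raises the log-probability of rollouts whose reward
    exceeds the baseline and lowers it for those that fall short.
    """
    return -(reward - baseline) * log_prob


if __name__ == "__main__":
    rng = random.Random(0)
    context = [1.0, 0.5]                            # toy mission/robot state
    tasks = [[0.2, 0.1], [0.9, 0.4], [0.5, 0.5]]    # toy task embeddings
    order, log_prob = rollout(context, tasks, rng)
    print(order, reinforce_loss(log_prob, reward=2.0, baseline=1.5))
```

In a trainable version the embeddings and scoring weights would be parameters of a differentiable model, and the loss would be averaged over a batch of sampled rollouts before taking a gradient step.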




Efficient Planning of Multi-Robot Collective Transport using Graph Reinforcement Learning with Higher Order Topological Abstraction

Efficient multi-robot task allocation (MRTA) is fundamental to various t...

A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone

Reinforcement learning has recently shown promise in learning quality so...

Transformer Network-based Reinforcement Learning Method for Power Distribution Network (PDN) Optimization of High Bandwidth Memory (HBM)

In this article, for the first time, we propose a transformer network-ba...

Scalable and transferable learning of algorithms via graph embedding for multi-robot reward collection

Can the success of reinforcement learning methods for combinatorial opti...

Learning scalable and transferable multi-robot/machine sequential assignment planning via graph embedding

Can the success of reinforcement learning methods for simple combinatori...

Integrated Decision and Control for High-Level Automated Vehicles by Mixed Policy Gradient and Its Experiment Verification

Self-evolution is indispensable to realize full autonomous driving. This...

Transferable Graph Optimizers for ML Compilers

Most compilers for machine learning (ML) frameworks need to solve many c...
