Intelligent Knowledge Distribution: Constrained-Action POMDPs for Resource-Aware Multi-Agent Communication

by Michael C. Fowler, et al.

This paper addresses a fundamental question of multi-agent knowledge distribution: what information should be sent to whom and when, given the limited resources available to each agent? Communication requirements for multi-agent systems can be rather high when an accurate picture of the environment and the state of other agents must be maintained. To reduce the impact of multi-agent coordination on networked systems, e.g., power and bandwidth, this paper introduces two concepts for partially observable Markov decision processes (POMDPs): 1) action-based constraints, which yield constrained-action POMDPs (CA-POMDPs); and 2) soft probabilistic constraint satisfaction for the resulting infinite-horizon controllers. To enable constraint analysis over an infinite horizon, an unconstrained policy is first represented as a Finite State Controller (FSC) and optimized with policy iteration. The FSC representation then allows for a combination of Markov chain Monte Carlo and discrete optimization to improve the probabilistic constraint satisfaction of the controller while minimizing the impact on the value function. Within the CA-POMDP framework we then propose Intelligent Knowledge Distribution (IKD), which yields per-agent policies for distributing knowledge between agents subject to interaction constraints. Finally, the CA-POMDP and IKD concepts are validated using an asset tracking problem where multiple unmanned aerial vehicles (UAVs) with heterogeneous sensors collaborate to localize a ground asset to assist in avoiding unseen obstacles in a disaster area. The IKD model was able to maintain asset tracking through multi-agent communications while only violating soft power and bandwidth constraints 3% of the time, while greedy and naive approaches violated constraints more than 60% of the time.
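To make the idea of soft probabilistic constraint satisfaction concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a stochastic finite state controller whose node transitions have already been marginalized over observations, so the controller behaves like a Markov chain over its nodes, and it estimates by plain Monte Carlo sampling how often the resource cost of the chosen actions exceeds a budget within a fixed window. The function name `estimate_violation_rate`, the cost model, and the windowed budget are all hypothetical choices made for illustration.

```python
import random


def estimate_violation_rate(action_probs, node_transitions, action_cost,
                            budget, window, episodes, seed=0):
    """Estimate the fraction of rollouts whose total action cost exceeds budget.

    action_probs[n][a]     : probability that controller node n selects action a
    node_transitions[n][m] : probability of moving from node n to node m
                             (assumed already marginalized over observations)
    action_cost[a]         : resource cost (e.g. power) of action a
    """
    rng = random.Random(seed)
    nodes = list(range(len(action_probs)))
    actions = list(range(len(action_cost)))
    violations = 0
    for _ in range(episodes):
        node = 0          # hypothetical choice: every rollout starts in node 0
        spent = 0.0
        for _ in range(window):
            # Sample an action from the current node's action distribution.
            action = rng.choices(actions, weights=action_probs[node])[0]
            spent += action_cost[action]
            # Move to the next controller node.
            node = rng.choices(nodes, weights=node_transitions[node])[0]
        if spent > budget:
            violations += 1
    return violations / episodes


if __name__ == "__main__":
    # Two nodes: a mostly quiet node and a mostly talkative node.
    # Action 0 = stay silent (cost 0), action 1 = transmit (cost 1).
    action_probs = [[0.9, 0.1], [0.3, 0.7]]
    node_transitions = [[0.8, 0.2], [0.5, 0.5]]
    rate = estimate_violation_rate(action_probs, node_transitions,
                                   action_cost=[0.0, 1.0],
                                   budget=4, window=10, episodes=2000)
    print(f"estimated violation rate: {rate:.3f}")
```

An estimate of this kind could serve as the quantity a controller search tries to push below a target threshold while keeping the value function as high as possible; the search itself (Markov chain Monte Carlo combined with discrete optimization, per the abstract) is not shown here.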



Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes

Autonomous systems often have logical constraints arising, for example, ...

Learning to Communicate Using Counterfactual Reasoning

This paper introduces a new approach for multi-agent communication learn...

Multi-Agent Common Knowledge Reinforcement Learning

In multi-agent reinforcement learning, centralised policies can only be ...

QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

The paper considers a class of multi-agent Markov decision processes (MD...

Diffusion Based Multi-Agent Adversarial Tracking

Target tracking plays a crucial role in real-world scenarios, particular...

Discrete-choice Multi-agent Optimization: Decentralized Hard Constraint Satisfaction for Smart Cities

Making Smart Cities more sustainable, resilient and democratic is emergi...

Multi-Objective Multi-Agent Planning for Discovering and Tracking Unknown and Varying Number of Mobile Objects

We consider the online planning problem for a team of agents to discover...
