Reinforcement Learning in Deep Structured Teams: Initial Results with Finite and Infinite Valued Features

10/06/2020
by Jalal Arabneydi, et al.

In this paper, we consider Markov chain and linear quadratic models for deep structured teams with discounted and time-average cost functions under two non-classical information structures: deep state sharing and no sharing. In deep structured teams, agents are coupled in dynamics and cost functions through the deep state, which refers to a set of orthogonal linear regressions of the states. We consider a homogeneous linear regression for Markov chain models (i.e., the empirical distribution of states) and a few orthonormal linear regressions for linear quadratic models (i.e., weighted averages of states). We develop planning algorithms for the case when the model is known, and propose reinforcement learning algorithms for the case when the model is not completely known. We establish the convergence of two model-free (reinforcement learning) algorithms, one for Markov chain models and one for linear quadratic models. The results are then applied to a smart grid.
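To make the two notions of deep state mentioned above concrete, the following is a minimal sketch, not the authors' implementation. It computes an empirical distribution over a finite state space (the Markov chain case) and a weighted average of scalar states (the linear quadratic case). The function names, the example team, and the uniform weights are assumptions chosen for illustration; the paper's own weighting and normalization may differ.

```python
from collections import Counter


def empirical_distribution(states, state_space):
    """Fraction of agents in each state of a finite state space."""
    if not states:
        raise ValueError("at least one agent state is required")
    counts = Counter(states)
    n = len(states)
    return [counts[s] / n for s in state_space]


def weighted_average(states, weights):
    """Weighted sum of scalar agent states (weights assumed pre-normalized)."""
    if len(states) != len(weights):
        raise ValueError("states and weights must have the same length")
    return sum(w * x for w, x in zip(weights, states))


# Hypothetical team of five agents with local states in {0, 1, 2}.
mc_states = [0, 2, 2, 1, 2]
dist = empirical_distribution(mc_states, [0, 1, 2])

# Hypothetical team of four agents with real-valued states and
# uniform weights 1/n (an assumption made for this sketch).
lq_states = [1.0, 2.0, 3.0, 4.0]
uniform_weights = [0.25] * 4
avg = weighted_average(lq_states, uniform_weights)

print(dist, avg)
```

In both cases the deep state is a low-dimensional summary whose size does not grow with the number of agents, which is why sharing it (or not) defines the two information structures described in the abstract.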


