Scaling Directed Controller Synthesis via Reinforcement Learning
The Directed Controller Synthesis technique finds solutions for the non-blocking property in discrete event systems by exploring a reduced portion of the exponentially large state space using best-first search. To minimize the number of explored states, it is currently guided by a domain-independent handcrafted heuristic, with which it reaches state-of-the-art performance. In this work, we propose a new method for obtaining heuristics based on Reinforcement Learning (RL). The synthesis algorithm is framed as an RL task with an unbounded action space, and a modified version of DQN is used. With a simple and general set of features, we show that it is possible to learn heuristics on small instances of a problem in a way that generalizes to larger instances. Our agents learn from scratch and, overall, outperform the existing heuristic on instances unseen during training.
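As an illustration of the idea, the following minimal Python sketch shows a best-first exploration in which every frontier transition is scored individually from a feature vector. Scoring one action at a time is one way to cope with an action set whose size is not fixed in advance. This is not the synthesis algorithm or the network used in this work: the names `best_first_explore`, `linear_score`, `featurize` and the toy problem are hypothetical, a linear function stands in for a learned Q-function, and a plain reachability goal stands in for the non-blocking check.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Sequence, Set, Tuple

State = Hashable
Features = Sequence[float]


def linear_score(weights: Sequence[float], features: Features) -> float:
    """Stand-in for a learned Q-value: a dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def best_first_explore(
    initial: State,
    successors: Callable[[State], Iterable[Tuple[str, State]]],
    featurize: Callable[[State, str, State], Features],
    score: Callable[[Features], float],
    is_goal: Callable[[State], bool],
    budget: int = 10_000,
) -> Tuple[bool, int]:
    """Expand frontier transitions by descending score.

    Returns (goal_found, number_of_expanded_transitions).
    """
    if is_goal(initial):
        return True, 0

    frontier: List[tuple] = []   # min-heap keyed on the negated score
    seen: Set[State] = {initial}
    counter = 0                  # unique tie-breaker, keeps heap entries comparable

    def push_transitions(src: State) -> None:
        nonlocal counter
        for label, dst in successors(src):
            s = score(featurize(src, label, dst))
            heapq.heappush(frontier, (-s, counter, src, label, dst))
            counter += 1

    push_transitions(initial)
    expansions = 0
    while frontier and expansions < budget:
        _, _, _src, _label, dst = heapq.heappop(frontier)
        expansions += 1
        if is_goal(dst):
            return True, expansions
        if dst not in seen:
            seen.add(dst)
            push_transitions(dst)
    return False, expansions


# Toy problem (hypothetical): integers in [-20, 20], reach the value 5 from 0.
def toy_successors(n: int) -> List[Tuple[str, int]]:
    moves = [("inc", n + 1), ("dec", n - 1)]
    return [(label, m) for label, m in moves if -20 <= m <= 20]


def toy_features(src: int, label: str, dst: int) -> List[float]:
    # A single feature: 1.0 when the transition is an increment.
    return [1.0 if label == "inc" else 0.0]


def run(weights: Sequence[float]) -> Tuple[bool, int]:
    return best_first_explore(
        0, toy_successors, toy_features,
        lambda f: linear_score(weights, f),
        lambda n: n == 5,
    )


helpful = run([1.0])      # weights that favour increments
misleading = run([-1.0])  # weights that favour decrements
print(helpful, misleading)
```

In this toy setting, both weight vectors eventually reach the goal, but the misleading one expands more transitions first. The number of expanded transitions is the kind of quantity a learned heuristic would aim to reduce.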