Continual Reinforcement Learning with Group Symmetries
Continual reinforcement learning (RL) aims to learn a sequence of tasks while retaining the ability to solve previously seen tasks and growing new policies to solve novel ones. Existing continual RL methods ignore the fact that some tasks are equivalent under simple group operations, such as rotations or translations. They therefore grow a new policy for each equivalent task and train it from scratch, resulting in poor sample efficiency and generalization. In this work, we propose a novel continual RL framework with group symmetries, which grows a policy for each group of equivalent tasks rather than for each individual task. We introduce a PPO-based RL algorithm with an invariant feature extractor and a novel task grouping mechanism based on the invariant features. We test our algorithm in realistic autonomous driving scenarios, where each group is associated with a map configuration. We show that our algorithm assigns tasks to their groups with high accuracy and outperforms baselines in generalization capability by a large margin.
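To make the idea of grouping tasks by invariant features concrete, the following is a minimal toy sketch and not the algorithm proposed in the paper. It assumes that each task is summarized by a square 2D map, that the symmetry group is the four 90-degree rotations, and that invariance is obtained by averaging over the group orbit instead of using a learned extractor. The names `invariant_features` and `assign_group` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def invariant_features(task_map: np.ndarray) -> np.ndarray:
    """Return a descriptor that is unchanged by 90-degree rotations.

    Assumption: a hand-crafted orbit average stands in for the learned
    invariant feature extractor mentioned in the abstract.
    """
    # Orbit of the map under the four 90-degree rotations.
    orbit = [np.rot90(task_map, k) for k in range(4)]
    # Rotating the input only permutes the orbit, so the mean is invariant.
    return np.mean(orbit, axis=0).ravel()


def assign_group(feature: np.ndarray, prototypes: list, threshold: float = 1e-6) -> int:
    """Return the index of the matching group, creating a new one if needed.

    `prototypes` holds one invariant feature per known group (hypothetical
    bookkeeping; in a continual RL setting each group would own a policy).
    """
    for group_id, proto in enumerate(prototypes):
        if np.linalg.norm(feature - proto) <= threshold:
            return group_id
    # No existing group is close enough: grow a new group.
    prototypes.append(feature)
    return len(prototypes) - 1


# Usage: two rotated copies of one map share a group; a new map opens another.
prototypes = []
base_map = np.arange(16.0).reshape(4, 4)
first = assign_group(invariant_features(base_map), prototypes)
rotated = assign_group(invariant_features(np.rot90(base_map)), prototypes)
other = assign_group(invariant_features(np.eye(4)), prototypes)
```

In this sketch a policy would be grown only when `assign_group` creates a new prototype, which mirrors the abstract's point that equivalent tasks should share a policy rather than each being trained from scratch.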