RAMario: Experimental Approach to Reptile Algorithm – Reinforcement Learning for Mario

by Sanyam Jain, et al.

This research paper presents an experimental approach to using the Reptile algorithm for reinforcement learning to train a neural network to play Super Mario Bros. We implement the Reptile algorithm using the Super Mario Bros Gym library and TensorFlow in Python, creating a neural network model with a single convolutional layer, a flatten layer, and a dense layer. We define the optimizer and use the Reptile class to create an instance of the Reptile meta-learning algorithm. We train the model over multiple tasks and episodes, choosing actions using the current weights of the neural network model, taking those actions in the environment, and updating the model weights with the Reptile algorithm. We evaluate the performance of the algorithm by printing the total reward for each episode. In addition, we compare the Reptile algorithm approach against two other popular reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Deep Q-Network (DQN), applied to the same Super Mario Bros task. Our results demonstrate that the Reptile algorithm provides a promising approach to few-shot learning in video game AI, with comparable or even better performance than the other two algorithms, particularly in terms of moves versus distance that the agent achieves over 1M episodes of training. The results show that the best total distances for world 1-2 in the game environment were 1732 (PPO), 1840 (DQN), and 2300 (RAMario). Full code is available at https://github.com/s4nyam/RAMario.
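The core of the training procedure described above is the Reptile meta-update: adapt a copy of the weights on a sampled task, then move the meta-weights a fraction of the way toward the adapted weights. A minimal sketch of that update rule, using numpy and a toy quadratic loss in place of Mario episodes (all names such as `inner_train`, `reptile_update`, and `meta_step_size` are illustrative, not the authors' actual code):

```python
import numpy as np

def inner_train(weights, task, inner_steps=5, inner_lr=0.1):
    """Toy inner loop: gradient steps on the quadratic loss
    ||w - task||^2, standing in for episodes of gameplay on one task."""
    w = weights.copy()
    for _ in range(inner_steps):
        grad = 2.0 * (w - task)      # gradient of the toy loss
        w -= inner_lr * grad
    return w

def reptile_update(weights, tasks, meta_step_size=0.5):
    """Reptile meta-step: theta <- theta + eps * (phi - theta),
    where phi are the task-adapted weights."""
    for task in tasks:
        adapted = inner_train(weights, task)
        weights = weights + meta_step_size * (adapted - weights)
    return weights

# Example: meta-weights settle between the two task optima.
theta = np.zeros(3)
tasks = [np.full(3, 1.0), np.full(3, 3.0)]
for _ in range(100):
    theta = reptile_update(theta, tasks)
print(theta)
```

In the paper's setting, the inner loop would instead roll out episodes in the Super Mario Bros Gym environment and take policy-gradient or Q-learning steps on the convolutional network; the meta-update itself is unchanged.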

