gym-saturation: Gymnasium environments for saturation provers (System description)

09/16/2023
by   Boris Shminke, et al.

This work describes a new version of a previously published Python package, gym-saturation: a collection of OpenAI Gym environments for using reinforcement learning to guide saturation-style provers based on the given clause algorithm. We contribute usage examples with two different provers: Vampire and iProver. We have also decoupled the proof state representation from reinforcement learning per se and provided examples of using ast2vec, a known Python code embedding model, as a first-order logic representation. In addition, we demonstrate how environment wrappers can transform a prover into a problem similar to a multi-armed bandit. We applied two reinforcement learning algorithms (Thompson sampling and Proximal Policy Optimisation) implemented in Ray RLlib to show the ease of experimentation with the new release of our package.
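
To make the multi-armed bandit view mentioned above more concrete, here is a minimal sketch of Beta-Bernoulli Thompson sampling on a toy bandit. It is an illustration only: it does not use gym-saturation, Vampire, iProver, or Ray RLlib, and the function name `thompson_sampling`, the arm count, and the reward probabilities are all hypothetical choices rather than anything taken from the paper.

```python
import random


def thompson_sampling(success_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a toy bandit.

    success_probs: hypothetical per-arm reward probabilities (an assumption,
    not data from the paper). Returns how many times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    # Beta(1, 1) prior per arm: alpha tracks successes, beta tracks failures.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its current posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate a Bernoulli reward from the (hidden) true probability.
        reward = 1 if rng.random() < success_probs[arm] else 0
        # Conjugate posterior update for the chosen arm only.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.1, 0.2, 0.8], n_rounds=2000)
print(pulls)
```

In this toy setting the pull counts should concentrate on the arm with the highest reward probability as the posteriors sharpen. In a prover setting, an arm would stand for whatever discrete choice the wrapped environment exposes, which this sketch leaves unspecified.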


