Monte Carlo Forest Search: UNSAT Solver Synthesis via Reinforcement learning
We introduce Monte Carlo Forest Search (MCFS), an offline algorithm for automatically synthesizing strong tree-search solvers for proving unsatisfiability on given distributions, leveraging ideas from the Monte Carlo Tree Search (MCTS) algorithm that led to breakthroughs in AlphaGo. The crucial difference between proving unsatisfiability and existing applications of MCTS, is that policies produce trees rather than paths. Rather than finding a good path (solution) within a tree, the search problem becomes searching for a small proof tree within a forest of candidate proof trees. We introduce two key ideas to adapt to this setting. First, we estimate tree size with paths, via the unbiased approximation from Knuth (1975). Second, we query a strong solver at a user-defined depth rather than learning a policy across the whole tree, in order to focus our policy search on early decisions, which offer the greatest potential for reducing tree size. We then present MCFS-SAT, an implementation of MCFS for learning branching policies for solving the Boolean satisfiability (SAT) problem that required many modifications from AlphaGo. We matched or improved performance over a strong baseline on two well-known SAT distributions (, ). Notably, we improved running time by 9% on over the solver and even further over the strongest UNSAT solver from the 2021 SAT competition.
READ FULL TEXT 
  
  
     share
 share