Scaling Scaling Laws with Board Games

04/07/2021
by Andy L. Jones

The largest experiments in machine learning now require resources far beyond the budget of all but a few institutions. Fortunately, it has recently been shown that the results of these huge experiments can often be extrapolated from the results of a sequence of far smaller, cheaper experiments. In this work, we show that the extrapolation can be done based not only on the size of the model, but also on the size of the problem. By conducting a sequence of experiments using AlphaZero and Hex, we show that the performance achievable with a fixed amount of compute degrades predictably as the game gets larger and harder. Along with our main result, we further show that the test-time and train-time compute available to an agent can be traded off while maintaining performance.
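To make the extrapolation idea concrete, here is a minimal, hypothetical sketch of fitting a performance-vs-compute curve at small scale and querying it at a larger budget. The data points are synthetic, and the logistic-in-log-compute functional form and the perf_curve name are illustrative assumptions for this sketch, not the paper's exact model.

```python
# Hypothetical illustration of scaling-law extrapolation: fit a curve to
# cheap, small-scale runs, then predict performance at a larger budget.
# All numbers below are synthetic; the logistic form is an assumption.

import numpy as np
from scipy.optimize import curve_fit

def perf_curve(log_compute, slope, midpoint, ceiling):
    """Logistic curve: performance rises with log-compute, then saturates."""
    return ceiling / (1.0 + np.exp(-slope * (log_compute - midpoint)))

# Synthetic observations: (log10 of training compute, win-rate-like score).
log_compute = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
performance = np.array([0.08, 0.20, 0.42, 0.65, 0.81, 0.90])

# Fit the three curve parameters to the small-scale runs.
params, _ = curve_fit(perf_curve, log_compute, performance,
                      p0=[2.0, 2.5, 1.0])

# Extrapolate: predict performance one order of magnitude beyond the data.
print(f"predicted score at log10(compute)=4.5: "
      f"{perf_curve(4.5, *params):.2f}")
```

In the same spirit, one could fit a separate curve per board size and examine how the fitted parameters shift as the game grows, which is the kind of problem-size extrapolation the abstract describes.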

Related research

09/24/2021 · Is the Number of Trainable Parameters All That Actually Matters?
Recent work has identified simple empirical scaling laws for language mo...

06/11/2021 · Scaling Laws for Acoustic Models
There is a recent trend in machine learning to increase model quality by...

07/18/2023 · Scaling Laws for Imitation Learning in NetHack
Imitation Learning (IL) is one of the most widely used methods in machin...

05/22/2023 · Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
Scaling laws have been recently employed to derive compute-optimal model...

10/26/2022 · Scaling Laws Beyond Backpropagation
Alternatives to backpropagation have long been studied to better underst...

11/03/2022 · Inverse scaling can become U-shaped
Although scaling language models improves performance on a range of task...

11/18/2022 · Path Independent Equilibrium Models Can Better Exploit Test-Time Computation
Designing networks capable of attaining better performance with an incre...
