The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks

06/30/2023
by   Ziqian Zhong, et al.
0

Do neural networks, trained on well-understood algorithmic tasks, reliably rediscover known algorithms for solving those tasks? Several recent studies, on tasks ranging from group arithmetic to in-context linear regression, have suggested that the answer is yes. Using modular addition as a prototypical problem, we show that algorithm discovery in neural networks is sometimes more complex. Small changes to model hyperparameters and initializations can induce the discovery of qualitatively different algorithms from a fixed training set, and even parallel implementations of multiple such algorithms. Some networks trained to perform modular addition implement a familiar Clock algorithm; others implement a previously undescribed, less intuitive, but comprehensible procedure which we term the Pizza algorithm, or a variety of even more complex procedures. Our results show that even simple learning problems can admit a surprising diversity of solutions, motivating the development of new tools for characterizing the behavior of neural networks across their algorithmic phase space.

READ FULL TEXT

page 4

page 6

page 15

page 17

page 19

page 22

page 24

page 25

research
07/24/2019

Knowledge transfer in deep block-modular neural networks

Although deep neural networks (DNNs) have demonstrated impressive result...
research
01/26/2023

Break It Down: Evidence for Structural Compositionality in Neural Networks

Many tasks can be described as compositions over subroutines. Though mod...
research
06/08/2021

Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks

Deep neural networks are powerful machines for visual pattern recognitio...
research
01/26/2023

Neural networks learn to magnify areas near decision boundaries

We study how training molds the Riemannian geometry induced by neural ne...
research
11/27/2017

Population Based Training of Neural Networks

Neural networks dominate the modern machine learning landscape, but thei...
research
04/29/2022

Wide and Deep Neural Networks Achieve Optimality for Classification

While neural networks are used for classification tasks across domains, ...
research
10/10/2022

Classifying topoi in synthetic guarded domain theory

Several different topoi have played an important role in the development...

Please sign up or login with your details

Forgot password? Click here to reset