Exploring the Memorization-Generalization Continuum in Deep Learning

02/08/2020
by Ziheng Jiang, et al.

Human learners appreciate that some facts demand memorization whereas other facts support generalization. For example, English verbs have irregular cases that must be memorized (e.g., go->went) and regular cases that generalize well (e.g., kiss->kissed, miss->missed). Likewise, deep neural networks have the capacity to memorize rare or irregular forms but nonetheless generalize across instances that share common patterns or structures. We analyze how individual instances are treated by a model on the memorization-generalization continuum via a consistency score (C-score). The score is the expected accuracy of a particular architecture on a held-out instance when the model is trained on a training set of fixed size sampled from the data distribution. We obtain empirical estimates of this score for individual instances in multiple datasets, and we show that the score identifies out-of-distribution and mislabeled examples at one end of the continuum and regular examples at the other end. We explore three proxies to the consistency score: kernel density estimation on input representations, kernel density estimation on hidden representations, and the time course of training, i.e., learning speed. In addition to helping to understand the dynamics of memorization versus generalization during training, the C-score proxies have potential applications in out-of-distribution detection, curriculum learning, and active data collection.
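To make the definition of the consistency score concrete, the following is a minimal sketch of one way to estimate it empirically: repeatedly sample a training subset of fixed size, fit a model, and record how often each held-out instance is classified correctly. Everything here is an illustrative assumption rather than the authors' setup: the function names (`estimate_c_scores`, `nearest_centroid_predict`, `make_toy_data`) are hypothetical, a nearest-centroid classifier stands in for "a particular architecture", and the data is a synthetic two-blob problem with a few deliberately flipped labels.

```python
import numpy as np


def nearest_centroid_predict(x_train, y_train, x_test, n_classes):
    """Stand-in model (an assumption): predict the class with the closest mean."""
    centroids = np.full((n_classes, x_train.shape[1]), np.inf)
    for c in range(n_classes):
        members = x_train[y_train == c]
        if len(members) > 0:  # classes absent from the subset stay at infinity
            centroids[c] = members.mean(axis=0)
    dists = ((x_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def estimate_c_scores(x, y, n_classes, train_size, n_runs=200, seed=0):
    """Estimate each instance's held-out accuracy over random training subsets."""
    rng = np.random.default_rng(seed)
    n = len(x)
    correct = np.zeros(n)
    held_out = np.zeros(n)
    for _ in range(n_runs):
        perm = rng.permutation(n)
        train_idx, test_idx = perm[:train_size], perm[train_size:]
        pred = nearest_centroid_predict(
            x[train_idx], y[train_idx], x[test_idx], n_classes
        )
        correct[test_idx] += pred == y[test_idx]
        held_out[test_idx] += 1
    # Instances that were never held out get NaN instead of a made-up score.
    scores = np.full(n, np.nan)
    np.divide(correct, held_out, out=scores, where=held_out > 0)
    return scores


def make_toy_data(seed=0, n_per_class=100, n_flipped=5):
    """Two Gaussian blobs; the first few class-0 points get the wrong label."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(loc=(-2.0, 0.0), scale=1.0, size=(n_per_class, 2))
    x1 = rng.normal(loc=(2.0, 0.0), scale=1.0, size=(n_per_class, 2))
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(n_per_class, int), np.ones(n_per_class, int)])
    flipped = np.zeros(len(x), dtype=bool)
    flipped[:n_flipped] = True
    y[flipped] = 1  # simulate mislabeled examples
    return x, y, flipped


if __name__ == "__main__":
    x, y, flipped = make_toy_data()
    scores = estimate_c_scores(x, y, n_classes=2, train_size=100)
    print("mean score, mislabeled:", scores[flipped].mean())
    print("mean score, clean:     ", scores[~flipped].mean())
```

In this toy setting the mislabeled points should tend to receive lower scores than the clean ones, which mirrors the claim that low scores flag mislabeled or atypical examples. With a deep network in place of the stand-in classifier, each run is a full training job, which is one motivation for the cheaper proxies mentioned above.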


