ResMem: Learn what you can and memorize the rest

by Zitong Yang et al.

The impressive generalization performance of modern neural networks is attributed in part to their ability to implicitly memorize complex training patterns. Inspired by this, we explore a novel mechanism to improve model generalization via explicit memorization. Specifically, we propose the residual-memorization (ResMem) algorithm, which augments an existing prediction model (e.g., a neural network) by fitting the model's residuals with a k-nearest-neighbor-based regressor. The final prediction is then the sum of the base model's prediction and the fitted residual regressor's output. By construction, ResMem can explicitly memorize the training labels. Empirically, we show that ResMem consistently improves the test-set generalization of the original prediction model across standard vision and natural language processing benchmarks. Theoretically, we formulate a stylized linear regression problem and rigorously show that ResMem achieves a more favorable test risk than the base predictor.
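The two-stage procedure described above (fit a base model, then fit a nearest-neighbor regressor to its training residuals and add the two predictions) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the ridge base model, the toy quadratic data, and the choice of k = 1 are assumptions made here for demonstration; with k = 1 the combined predictor reproduces the training labels exactly, matching the memorization property stated in the abstract.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor

# Toy data: a noisy quadratic target that a linear base model cannot fully fit.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(200, 1))
y_train = X_train[:, 0] ** 2 + 0.05 * rng.normal(size=200)
X_test = rng.uniform(-1, 1, size=(50, 1))
y_test = X_test[:, 0] ** 2

# Step 1: fit the base predictor (a stand-in for a trained neural network).
base = Ridge(alpha=1.0).fit(X_train, y_train)

# Step 2: fit a k-NN regressor to the base model's training residuals.
residuals = y_train - base.predict(X_train)
knn = KNeighborsRegressor(n_neighbors=1).fit(X_train, residuals)

# Step 3: final prediction = base prediction + predicted residual.
def resmem_predict(X):
    return base.predict(X) + knn.predict(X)

# With k = 1, training labels are memorized exactly by construction.
train_gap = np.max(np.abs(resmem_predict(X_train) - y_train))

base_mse = np.mean((base.predict(X_test) - y_test) ** 2)
resmem_mse = np.mean((resmem_predict(X_test) - y_test) ** 2)
print(train_gap, base_mse, resmem_mse)
```

On this toy problem the residual regressor captures the curvature the linear base model misses, so the combined test error is lower than the base model's; in the paper the same additive construction is applied on top of neural networks.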


