On the accuracy of self-normalized log-linear models

06/12/2015
by Jacob Andreas, et al.

Calculation of the log-normalizer is a major computational obstacle in applications of log-linear models with large output spaces. The problem of fast normalizer computation has therefore attracted significant attention in the theoretical and applied machine learning literature. In this paper, we analyze a recently proposed technique known as "self-normalization", which introduces a regularization term in training to penalize log normalizers for deviating from zero. This makes it possible to use unnormalized model scores as approximate probabilities. Empirical evidence suggests that self-normalization is extremely effective, but a theoretical understanding of why it should work, and how generally it can be applied, is largely lacking. We prove generalization bounds on the estimated variance of normalizers and upper bounds on the loss in accuracy due to self-normalization, describe classes of input distributions that self-normalize easily, and construct explicit examples of high-variance input distributions. Our theoretical results make predictions about the difficulty of fitting self-normalized models to several classes of distributions, and we conclude with empirical validation of these predictions.
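The training objective described above can be made concrete with a small sketch. The Python/NumPy code below is a minimal illustration, not the authors' implementation: it computes the loss and gradient for a multiclass log-linear model whose objective is the negative log-likelihood plus a penalty that pushes each example's log-normalizer toward zero, so that unnormalized scores can be read as approximate log-probabilities at test time. The function name, the penalty weight `alpha`, and the toy data are all hypothetical.

```python
import numpy as np

def self_normalized_loss(W, X, y, alpha=0.1):
    """Loss and gradient for a self-normalized log-linear model (sketch).

    W: (d, k) weight matrix, X: (n, d) inputs, y: (n,) labels in {0..k-1}.
    Objective per example: -log p(y|x) + alpha * A(x)^2,
    where A(x) = log sum_y' exp(w_{y'} . x) is the log-normalizer.
    """
    scores = X @ W                                    # unnormalized log-scores, (n, k)
    m = scores.max(axis=1, keepdims=True)             # stabilizer for log-sum-exp
    log_Z = m[:, 0] + np.log(np.exp(scores - m).sum(axis=1))  # log-normalizer A(x), (n,)

    nll = (log_Z - scores[np.arange(len(y)), y]).mean()
    penalty = alpha * (log_Z ** 2).mean()             # drives A(x) toward zero

    # Gradient: softmax probabilities appear in both terms.
    probs = np.exp(scores - log_Z[:, None])
    d_scores = probs.copy()
    d_scores[np.arange(len(y)), y] -= 1.0             # d(nll)/d(scores)
    d_scores += 2.0 * alpha * log_Z[:, None] * probs  # d(penalty)/d(scores)
    grad = X.T @ d_scores / len(y)
    return nll + penalty, grad


if __name__ == "__main__":
    # Toy usage: fit by plain gradient descent on random data.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = rng.integers(0, 3, size=200)
    W = np.zeros((5, 3))
    for _ in range(500):
        loss, g = self_normalized_loss(W, X, y)
        W -= 0.5 * g
    # After training, exp(X @ W) can be used directly as approximate
    # probabilities, since the penalty keeps log_Z close to zero.
```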


