Dynamic Evaluation of Neural Sequence Models

09/21/2017
by Ben Krause, et al.

We present a methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient-descent-based mechanism, causing them to assign higher probabilities to recurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. It improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
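As a concrete illustration of the adaptation mechanism described above, the following is a minimal PyTorch sketch of dynamic evaluation, not the paper's implementation: it assumes an LSTM language model called as `logits, hidden = model(inp, hidden)` with a tuple hidden state, and it uses plain SGD with an illustrative segment length and learning rate. The paper's full method additionally decays the adapted weights toward the original parameters and scales gradients RMSprop-style, both of which this sketch omits.

```python
import torch
import torch.nn.functional as F

def dynamic_eval(model, tokens, seg_len=5, lr=1e-4):
    """Score `tokens` with `model` while adapting its weights to recent
    history (a sketch of dynamic evaluation under stated assumptions).

    Assumes an LSTM language model with the hypothetical interface
    `logits, hidden = model(inp, hidden)`, where `hidden` is a tuple
    and `hidden=None` initializes the state.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    hidden = None
    total_nll, n_tokens = 0.0, 0
    for i in range(0, tokens.size(0) - 1, seg_len):
        end = min(i + seg_len, tokens.size(0) - 1)
        inp = tokens[i:end].unsqueeze(1)   # shape (seg, batch=1)
        tgt = tokens[i + 1:end + 1]        # next-token targets
        logits, hidden = model(inp, hidden)
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), tgt)
        total_nll += loss.item() * tgt.numel()
        n_tokens += tgt.numel()
        # The defining step: update the weights on the segment just
        # scored, so patterns that recur later in the test sequence
        # receive higher probability.
        opt.zero_grad()
        loss.backward()
        opt.step()
        hidden = tuple(h.detach() for h in hidden)  # truncate backprop
    return float(torch.exp(torch.tensor(total_nll / n_tokens)))  # perplexity
```

The difference from standard evaluation is that each segment's loss both scores the model and updates its weights, so information about the test sequence seen so far persists in the parameters rather than only in the hidden state.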


Related research

Dynamic Evaluation of Transformer Language Models (04/17/2019)
This research note combines two methods that have recently improved the ...

An Analysis of Neural Language Modeling at Multiple Scales (03/22/2018)
Many of the leading approaches in language modeling introduce novel, com...

Multiplicative LSTM for sequence modelling (09/26/2016)
We introduce multiplicative LSTM (mLSTM), a recurrent neural network arc...

Attending to Characters in Neural Sequence Labeling Models (11/14/2016)
Sequence labeling architectures use word embeddings for capturing simila...

PopEval: A Character-Level Approach to End-To-End Evaluation Compatible with Word-Level Benchmark Dataset (08/29/2019)
The most prevalent scope of interest for OCR applications used to be sca...

DeepStance at SemEval-2016 Task 6: Detecting Stance in Tweets Using Character and Word-Level CNNs (06/17/2016)
This paper describes our approach for the Detecting Stance in Tweets tas...
