The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction

07/15/2020
by   Alice Martin, et al.

This paper introduces the Sequential Monte Carlo Transformer, an original approach that naturally captures the distribution of the observations in a Transformer architecture. The keys, queries, values and attention vectors of the network are treated as the unobserved stochastic states of its hidden structure. In this generative model, the observation received at each time step is a random function of these past states within a given attention window. In this general state-space setting, we use Sequential Monte Carlo methods to approximate the posterior distributions of the states given the observations, and then to estimate the gradient of the log-likelihood. We thus propose a generative model that provides a predictive distribution instead of a single-point estimate.
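
The Sequential Monte Carlo step mentioned in the abstract can be illustrated on a much simpler model. The following is a minimal sketch, not the paper's method: a bootstrap particle filter for a toy one-dimensional linear-Gaussian state-space model, which approximates the filtering distribution of the hidden state and estimates the log-likelihood of the observations. The function name, the toy model and all parameter values (a, sigma_x, sigma_y, n_particles) are assumptions chosen for illustration; the Transformer's keys, queries, values and attention vectors are not modelled here.

```python
import numpy as np


def bootstrap_particle_filter(ys, n_particles=500, a=0.9,
                              sigma_x=1.0, sigma_y=0.5, seed=0):
    """Bootstrap particle filter for the assumed toy model
        x_t = a * x_{t-1} + sigma_x * eps_t,   y_t = x_t + sigma_y * eta_t,
    with eps_t, eta_t standard normal (hypothetical, not the paper's model).
    Returns the filtering means and a log-likelihood estimate."""
    rng = np.random.default_rng(seed)
    # Draw initial particles from the stationary distribution of x.
    particles = rng.normal(0.0, sigma_x / np.sqrt(1.0 - a**2), size=n_particles)
    log_lik = 0.0
    means = []
    for y in ys:
        # Propagate each particle through the state transition.
        particles = a * particles + sigma_x * rng.normal(size=n_particles)
        # Weight particles by the Gaussian observation density (log scale).
        log_w = (-0.5 * ((y - particles) / sigma_y) ** 2
                 - np.log(sigma_y * np.sqrt(2.0 * np.pi)))
        shift = log_w.max()  # subtract the max for numerical stability
        w = np.exp(log_w - shift)
        # The mean unnormalized weight estimates p(y_t | y_{1:t-1}).
        log_lik += shift + np.log(w.mean())
        w /= w.sum()
        means.append(np.sum(w * particles))
        # Multinomial resampling to limit weight degeneracy.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(means), log_lik


# Usage: simulate data from the same toy model, then filter it.
rng = np.random.default_rng(1)
T, a, sigma_x, sigma_y = 50, 0.9, 1.0, 0.5
xs = np.zeros(T)
x = rng.normal(0.0, sigma_x / np.sqrt(1.0 - a**2))
for t in range(T):
    x = a * x + sigma_x * rng.normal()
    xs[t] = x
ys = xs + sigma_y * rng.normal(size=T)
filtered_means, log_lik = bootstrap_particle_filter(ys)
```

The weighted particles at each step form an empirical approximation of the posterior over the hidden state, so quantiles or samples can be read from them rather than a single point estimate.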

research
02/04/2022

De-Sequentialized Monte Carlo: a parallel-in-time particle smoother

Particle smoothers are SMC (Sequential Monte Carlo) algorithms designed ...
research
05/13/2019

Replica Conditional Sequential Monte Carlo

We propose a Markov chain Monte Carlo (MCMC) scheme to perform state inf...
research
06/24/2019

Divide and Couple: Using Monte Carlo Variational Objectives for Posterior Approximation

Recent work in variational inference (VI) uses ideas from Monte Carlo es...
research
11/17/2019

State Space Emulation and Annealed Sequential Monte Carlo for High Dimensional Optimization

Many high dimensional optimization problems can be reformulated into a p...
research
02/22/2018

Deep learning algorithm for data-driven simulation of noisy dynamical system

We present a deep learning model, DE-LSTM, for the simulation of a stoch...
research
06/24/2011

Monte Carlo Methods for Tempo Tracking and Rhythm Quantization

We present a probabilistic generative model for timing deviations in exp...
research
10/05/2021

Deep Synoptic Monte Carlo Planning in Reconnaissance Blind Chess

This paper introduces deep synoptic Monte Carlo planning (DSMCP) for lar...
