FutureGAN: Anticipating the Future Frames of Video Sequences using Spatio-Temporal 3d Convolutions in Progressively Growing Autoencoder GANs

10/02/2018
by   Sandra Aigner, et al.

We propose FutureGAN, a new autoencoder GAN model that predicts future frames of a video sequence given a sequence of past frames. Our approach extends the recently introduced progressive growing of GANs (PGGAN) architecture by Karras et al. [18]. During training, the resolution of the input and output frames is gradually increased by progressively adding layers to both the discriminator and the generator network. To learn representations that capture both the spatial and the temporal components of a frame sequence, we use spatio-temporal 3d convolutions. We already achieve promising results for frame resolutions of 128 x 128 px on a variety of datasets ranging from synthetic to natural frame sequences, and the architecture is in principle not limited to a specific frame resolution. FutureGAN learns to generate plausible futures, and its learned representations seem to capture the spatial and temporal transformations of the input frames. A major advantage of our architecture over the majority of other video prediction models is its simplicity: the model receives only the raw pixel values as input and generates output frames without relying on additional constraints, conditions, or complex pixel-based error loss metrics.
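
To make the two ideas in the abstract concrete, the following is a minimal, illustrative sketch and not the FutureGAN implementation. It shows (1) a naive single-channel spatio-temporal 3d convolution, whose kernel slides over time as well as height and width, and (2) a resolution schedule that doubles the frame size at each growth step. The function names `conv3d_valid` and `resolution_schedule`, the starting resolution of 4 x 4 px, and the doubling rule are assumptions made for illustration; the abstract only states that resolution increases gradually up to 128 x 128 px.

```python
def conv3d_valid(video, kernel):
    """Naive single-channel 3d convolution (cross-correlation, as in most
    deep learning libraries) with valid padding and stride 1.

    video:  nested list indexed as [time][row][col]
    kernel: nested list indexed as [kt][kh][kw]
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):              # slide along the time axis
        plane = []
        for y in range(H - kh + 1):          # slide along rows
            row = []
            for x in range(W - kw + 1):      # slide along columns
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def resolution_schedule(start=4, final=128):
    """Hypothetical growth schedule: double the frame resolution at each
    step, never exceeding `final`. The start value is an assumption."""
    res = [start]
    while res[-1] * 2 <= final:
        res.append(res[-1] * 2)
    return res


# Toy video: a single bright pixel moving one column to the right per frame.
video = [[[1 if (y == 0 and x == t) else 0 for x in range(3)]
          for y in range(3)]
         for t in range(3)]

# A 2x1x1 kernel that subtracts the previous frame from the current one,
# so the output responds to change over time rather than to static content.
temporal_diff = [[[-1.0]], [[1.0]]]
motion = conv3d_valid(video, temporal_diff)

schedule = resolution_schedule()
```

In this toy example, `motion` has two time steps (three input frames, temporal kernel extent of two), and each output plane is negative where the pixel was and positive where it moved to. A 2d convolution applied to each frame separately could not produce this response, because it never sees two frames at once.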

research
03/02/2022

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video

Recent transformer-based solutions have been introduced to estimate 3D h...
research
06/24/2019

LMVP: Video Predictor with Leaked Motion Information

We propose a Leaked Motion Video Predictor (LMVP) to predict future fram...
research
05/24/2019

From Here to There: Video Inbetweening Using Direct 3D Convolutions

We consider the problem of generating plausible and diverse video sequen...
research
04/10/2020

Multiresolution Convolutional Autoencoders

We propose a multi-resolution convolutional autoencoder (MrCAE) architec...
research
01/29/2017

Transformation-Based Models of Video Sequences

In this work we propose a simple unsupervised approach for next frame pr...
research
11/25/2020

Temporal Autoencoder with U-Net Style Skip-Connections for Frame Prediction

Finding sustainable and novel solutions to predict city-wide mobility be...
research
06/01/2018

Semi-Recurrent CNN-based VAE-GAN for Sequential Data Generation

A semi-recurrent hybrid VAE-GAN model for generating sequential data is ...
