A classical problem in computer vision is to infer a 3D scene representa...
Vision Transformers (ViT) have been shown to attain highly competitive p...
Convolutional Neural Networks (CNNs) are the go-to model for computer vi...
Neural Networks require large amounts of memory and compute to process h...
While the Transformer architecture has become the de-facto standard for ...
In this paper, we offer a preliminary investigation into the task of in-...
In this work, we present an empirical study of generation order for mach...
Due to the statistical complexity of video, the high degree of inherent ...
We present KERMIT, a simple insertion-based approach to generative model...
We present the Insertion Transformer, an iterative, partially autoregres...
Deep autoregressive sequence-to-sequence models have demonstrated impres...
Music relies heavily on self-reference to build structure and meaning. W...
Music relies heavily on repetition to build structure and meaning. Self-...
Self-attentive feed-forward sequence models have been shown to achieve i...
Tensor2Tensor is a library for deep learning models that is well-suited ...
Autoregressive sequence models based on deep neural networks, such as RN...
Relying entirely on an attention mechanism, the Transformer introduced b...
Image generation has been successfully cast as an autoregressive sequenc...
Deep learning yields great results across many fields, from speech recog...
The dominant sequence transduction models are based on complex recurrent...
We present a solution to the problem of paraphrase identification of que...
We present a framework for question answering that can efficiently scale...
We propose a simple neural architecture for natural language inference. ...