Dynamic Automatic Differentiation of GPU Broadcast Kernels

10/18/2018
by Jarrett Revels, et al.

We show how forward-mode automatic differentiation (AD) can be employed within larger reverse-mode computations to dynamically differentiate broadcast operations in a GPU-friendly manner. Our technique fully exploits the broadcast Jacobian's inherent sparsity structure, and unlike a pure reverse-mode approach, this "mixed-mode" approach does not require a backwards pass over the broadcasted operation's subgraph, obviating the need for several reverse-mode-specific programmability restrictions on user-authored broadcast operations. Most notably, this approach allows broadcast fusion in primal code despite the presence of data-dependent control flow. We discuss an experiment in which a Julia implementation of our technique outperformed pure reverse-mode TensorFlow and Julia implementations for differentiating through broadcast operations within an HM-LSTM cell update calculation.
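The mixed-mode idea can be illustrated with a small CPU-only sketch in plain Python. It is not the authors' Julia implementation and does not run on a GPU; the names `Dual`, `broadcast_with_pullback`, and `kernel` are hypothetical, and the inputs are assumed to be equal-length lists of floats (no shape expansion). Each element is evaluated once with dual numbers that carry one partial derivative per broadcast argument, so the diagonal blocks of the broadcast Jacobian are obtained during the forward pass. The reverse step then only scales those stored partials by the incoming output adjoint, and never re-traverses the kernel, which is why a data-dependent branch inside the kernel causes no difficulty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value plus one partial derivative per broadcast argument."""
    val: float
    partials: tuple

    def _lift(self, other):
        # Promote plain numbers to a Dual with zero partials.
        if isinstance(other, Dual):
            return other
        return Dual(float(other), (0.0,) * len(self.partials))

    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val,
                    tuple(a + b for a, b in zip(self.partials, o.partials)))

    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        # Product rule applied to every partial at once.
        return Dual(self.val * o.val,
                    tuple(self.val * b + o.val * a
                          for a, b in zip(self.partials, o.partials)))

    __rmul__ = __mul__

    def __gt__(self, other):
        # Comparisons look only at the primal value, so branches behave
        # exactly as they would in the undifferentiated code.
        return self.val > self._lift(other).val


def broadcast_with_pullback(f, *arrays):
    """Apply f elementwise; return outputs and a reverse-mode pullback."""
    n = len(arrays)
    ys, jac = [], []
    for elems in zip(*arrays):
        # Seed argument k with the k-th unit vector of partials.
        seeds = [Dual(float(x), tuple(1.0 if j == k else 0.0 for j in range(n)))
                 for k, x in enumerate(elems)]
        out = f(*seeds)  # assumed to return a Dual
        ys.append(out.val)
        jac.append(out.partials)  # row i holds d y_i / d x_{k,i} for each k

    def pullback(y_bar):
        # Input adjoints: x_bar[k][i] = y_bar[i] * d y_i / d x_{k,i}.
        return [[g * p[k] for g, p in zip(y_bar, jac)] for k in range(n)]

    return ys, pullback


def kernel(a, b):
    # Hypothetical user-authored scalar kernel with data-dependent control flow.
    if a > b:
        return a * b
    return a + 2.0 * b


ys, pullback = broadcast_with_pullback(kernel, [3.0, 1.0], [2.0, 5.0])
a_bar, b_bar = pullback([1.0, 1.0])
```

Here the first element takes the `a * b` branch and the second takes the `a + 2.0 * b` branch, yet both are handled by the same single forward pass. A GPU version would run the per-element loop as one fused kernel; that part is outside the scope of this sketch.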


