Improving the Neural GPU Architecture for Algorithm Learning

02/28/2017
by Karlis Freivalds, et al.

Algorithm learning is a core problem in artificial intelligence, with significant implications for the level of automation that machines can achieve. Recently, deep learning methods have emerged for synthesizing an algorithm from its input-output examples, the most successful being the Neural GPU, which is capable of learning multiplication. We present several improvements to the Neural GPU that substantially reduce training time and improve generalization. We introduce a generally applicable technique for using hard nonlinearities together with a saturation cost. We also introduce diagonal gates, a technique that can be applied to active-memory models. The proposed architecture is the first capable of learning decimal multiplication end-to-end.
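As a rough illustration of the hard-nonlinearity-with-saturation-cost idea: a hard nonlinearity clips pre-activations to a fixed linear range, and the saturation cost penalizes pre-activations that are pushed past that range, discouraging the network from parking units in the flat (zero-gradient) region. This is only a sketch under assumptions; the `margin` hyperparameter and the exact penalty shape here are illustrative, not the paper's formulation.

```python
import numpy as np

def hard_tanh(x):
    # Piecewise-linear "hard" tanh: clip pre-activations to [-1, 1].
    return np.clip(x, -1.0, 1.0)

def saturation_cost(x, margin=0.9):
    # Mean penalty on pre-activations beyond the linear region.
    # `margin` is an assumed hyperparameter for illustration.
    return float(np.mean(np.maximum(np.abs(x) - margin, 0.0)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(hard_tanh(x))        # the two extreme values are clipped to +/-1
print(saturation_cost(x))  # nonzero, since |+/-2.0| exceeds the margin
```

In training, the saturation cost would be added to the task loss with a small weight, so gradients still flow that pull saturated units back toward the linear region even though the clipped nonlinearity itself has zero gradient there.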


