Large language models display remarkable capabilities in logical and mat...
This work investigates the nuanced algorithm design choices for deep lea...
When using Stochastic Gradient Descent (SGD) for training machine learni...
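For concreteness, a minimal NumPy sketch of a mini-batch SGD update for a linear least-squares model; the model, loss, and hyperparameters here are illustrative placeholders rather than the specific setting studied above.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, batch_size=32, epochs=50, seed=0):
    """Plain mini-batch SGD on the squared loss of a linear model X @ w."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)                          # reshuffle examples each epoch
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)   # gradient of the mean squared error
            w -= lr * grad                                 # one SGD step
    return w

# toy usage: recover a known linear target from noiseless samples
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = np.arange(1.0, 6.0)
print(sgd_linear_regression(X, X @ w_true))
```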
Finetuning a pretrained model has become a standard approach for trainin...
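As a rough illustration of the usual finetuning recipe (freeze the pretrained features, train a new task head), a short PyTorch sketch; the backbone architecture, checkpoint path, and data below are hypothetical stand-ins, not the procedure analyzed in the work above.

```python
import torch
import torch.nn as nn

# hypothetical pretrained backbone; in practice it would be loaded from a checkpoint,
# e.g. backbone.load_state_dict(torch.load("pretrained_backbone.pt"))  # placeholder path
backbone = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 128), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad = False                      # freeze the pretrained features

head = nn.Linear(128, 10)                        # new head for the downstream task
model = nn.Sequential(backbone, head)

opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# one illustrative update on random tensors standing in for downstream data
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
opt.step()
print(float(loss))
```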
There is mounting empirical evidence of emergent phenomena in the capabi...
Large neural networks trained in the overparameterized regime are able t...
We study the power of learning via mini-batch stochastic gradient descen...
We study the relative power of learning with gradient descent on differe...
Several recent works have shown separation results between deep neural n...
Convolutional neural networks (CNN) exhibit unmatched performance in a m...
A supervised learning algorithm has access to a distribution of labeled...
In recent years we have seen a rapidly growing line of research which shows le...
The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a...
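For orientation, a compact sketch of the iterative magnitude-pruning loop behind the lottery ticket hypothesis (train, prune the smallest surviving weights, rewind the rest to their initial values); the toy model, data, and pruning schedule are assumptions for illustration, and a full implementation would typically prune only weight matrices rather than every parameter.

```python
import copy
import torch
import torch.nn as nn

def train(model, X, y, masks=None, steps=200, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
        if masks is not None:                    # keep pruned weights pinned at zero
            with torch.no_grad():
                for n, p in model.named_parameters():
                    p.mul_(masks[n])

def find_lottery_ticket(model, X, y, rounds=3, prune_frac=0.2):
    init_state = copy.deepcopy(model.state_dict())               # the original initialization
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()}
    for _ in range(rounds):
        train(model, X, y, masks)
        for n, p in model.named_parameters():
            surviving = (p * masks[n]).abs()
            k = int(prune_frac * masks[n].sum())                 # prune a fraction of survivors
            if k > 0:
                thresh = surviving[masks[n].bool()].kthvalue(k).values
                masks[n] = (surviving > thresh).float() * masks[n]
        model.load_state_dict(init_state)                        # rewind weights to initialization
        with torch.no_grad():
            for n, p in model.named_parameters():
                p.mul_(masks[n])                                 # apply the sparsity mask
    return masks

# toy usage on random data
X, y = torch.randn(256, 20), torch.randint(0, 2, (256,))
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
masks = find_lottery_ticket(net, X, y)
print({n: round(float(m.mean()), 3) for n, m in masks.items()})  # fraction of weights kept
```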
Training neural-networks is computationally hard. However, in practice t...
Since its inception in the 1980s, ID3 has become one of the most success...
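To fix ideas, a bare-bones ID3-style recursion over binary features that greedily splits on the feature with the largest information gain; the helper names and the toy dataset are illustrative only, and a complete ID3 would also handle multi-valued attributes and additional stopping rules.

```python
import numpy as np

def entropy(y):
    """Shannon entropy of a binary label vector."""
    p = y.mean()
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def id3(X, y, features=None):
    """Grow a decision tree over binary features by greedily maximizing information gain."""
    if features is None:
        features = list(range(X.shape[1]))
    if len(features) == 0 or entropy(y) == 0.0:
        return {"leaf": int(round(y.mean()))}            # pure node or no features: majority label
    def gain(f):
        on = X[:, f] == 1
        if on.all() or (~on).all():
            return 0.0                                   # degenerate split carries no information
        return entropy(y) - (on.mean() * entropy(y[on]) + (~on).mean() * entropy(y[~on]))
    best = max(features, key=gain)
    on = X[:, best] == 1
    if on.all() or (~on).all():
        return {"leaf": int(round(y.mean()))}
    rest = [f for f in features if f != best]
    return {"feature": best,
            "if_0": id3(X[~on], y[~on], rest),
            "if_1": id3(X[on], y[on], rest)}

# toy usage: learn y = x0 OR x1 from the four possible 2-bit inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])
print(id3(X, y))
```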
In recent years, there have been many attempts to understand popular heuristic...
ReLU neural-networks have been the focus of many recent theoretical w...
Understanding the power of depth in feed-forward neural networks is an o...
We describe a layer-by-layer algorithm for training deep convolutional n...
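As a generic sketch of greedy layer-wise training (optimize one block at a time against an auxiliary classifier, then freeze it and feed its outputs forward), with fully-connected blocks standing in for convolutional layers; this is not the specific algorithm or its guarantees, only the general idea under stated assumptions.

```python
import torch
import torch.nn as nn

def train_layerwise(blocks, X, y, num_classes=10, steps=200, lr=1e-2):
    """Greedy layer-wise training: each block is optimized together with a throwaway
    linear head on the task loss, then frozen; its outputs become the next block's input."""
    loss_fn = nn.CrossEntropyLoss()
    features = X
    for block in blocks:
        width = block(features[:1]).shape[1]             # output width of this block
        head = nn.Linear(width, num_classes)             # auxiliary classifier for this stage
        opt = torch.optim.SGD(list(block.parameters()) + list(head.parameters()), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss_fn(head(block(features)), y).backward()
            opt.step()
        for p in block.parameters():
            p.requires_grad = False                      # freeze the trained block
        features = block(features).detach()              # fixed representation for the next stage
    return nn.Sequential(*blocks)

# toy usage with fully-connected blocks standing in for convolutional layers
X, y = torch.randn(512, 32), torch.randint(0, 10, (512,))
blocks = [nn.Sequential(nn.Linear(32, 64), nn.ReLU()),
          nn.Sequential(nn.Linear(64, 64), nn.ReLU())]
net = train_layerwise(blocks, X, y)
print(net)
```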