This work investigates the nuanced algorithm design choices for deep lea...
We study the problem of sequential prediction in the stochastic setting ...
Why do large language models sometimes output factual inaccuracies and
e...
We consider the well-studied problem of learning a linear combination of...
Algorithmic reasoning requires capabilities which are most naturally
und...
Neural Networks (NNs) struggle to efficiently learn certain problems, su...
There is mounting empirical evidence of emergent phenomena in the
capabi...
Contrastive learning is a popular form of self-supervised learning that
...
Intrinsic rewards play a central role in handling the
exploration-exploi...
Self-attention, an architectural motif designed to model long-range
inte...
We consider a general statistical estimation problem wherein binary labe...
Noise contrastive learning is a popular technique for unsupervised
repre...
There is an increasing need for effective active learning algorithms tha...
When balancing the practical tradeoffs of iterative methods for large-sc...
We prove several hardness results for training depth-2 neural networks w...
Graphical models are powerful tools for modeling high-dimensional data, ...
We give the first statistical-query lower bounds for agnostically learni...
We prove the first superpolynomial lower bounds for learning one-layer n...
We consider the fundamental problem of ReLU regression, where the goal i...
We study the problem of learning adversarially robust halfspaces in the
...
We consider the problem of computing the best-fitting ReLU with respect ...
We study the problem of learning graphical models with latent variables....
We consider the problem of learning the weighted edges of a mixture of t...
Giving provable guarantees for learning neural networks is a core challe...
Recent work has shown that additive threat models, which only permit the...
We give the first efficient algorithm for learning the structure of an I...
We propose a new algorithm to learn a one-hidden-layer convolutional neu...
We give the first provably efficient algorithm for learning a one hidden...
We give a polynomial-time algorithm for learning neural networks with on...