In this work, we introduce Boolformer, the first Transformer architectur...
Experimental results have shown that curriculum learning, i.e., presenti...
We identify incremental learning dynamics in transformers, where the dif...
Reed-Muller codes were introduced in 1954, with a simple explicit constr...
We investigate the time complexity of SGD learning on fully-connected ne...
This paper considers the learning of logical (Boolean) functions with fo...
We prove computational limitations for learning with neural networks tra...
This paper considers the Pointer Value Retrieval (PVR) benchmark introdu...
This paper introduces the notion of "Initial Alignment" (INAL) between a...
It is currently known how to characterize functions that neural networks...
It was recently shown that almost all solutions in the symmetric binary ...
This paper identifies a structural property of data distributions that e...
We study the power of learning via mini-batch stochastic gradient descen...
We study the relative power of learning with gradient descent on differe...
We consider the symmetric binary perceptron model, a simple model of neu...
The limit of the entropy in the stochastic block model (SBM) has been ch...
A well-known result across information theory, machine learning, and sta...
Principal Component Analysis (PCA) is a powerful tool in statistics and ...
The r-th power of a graph modifies a graph by connecting every vertex pa...
The problem of learning graphons has attracted considerable attention ac...
This paper introduces a model for opinion dynamics, where at each time s...
This paper considers 'δ-almost Reed-Muller codes', i.e., linear codes sp...
Reed-Muller (RM) codes are among the oldest, simplest and perhaps most u...
The goal of this paper is to characterize function distributions that de...
This paper investigates entropic matroids, that is, matroids whose rank ...
We derive generalization and excess risk bounds for neural nets using a ...
In 2000, Evans et al. [Eva+00] proved the subadditivity of the mutual in...
We propose a new class of efficient decoding algorithms for Reed-Muller ...
Reed-Muller (RM) codes and polar codes are generated by the same matrix ...
As the success of deep learning reaches more grounds, one would like to ...
Spectral algorithms, such as principal component analysis and spectral c...
Bounding the generalization error of learning algorithms has a long hist...
This paper considers the problem of reconstructing n independent uniform...
This paper develops coding techniques to reduce the running time of dist...
We analyze the problem of estimating a signal from multiple measurements...
This paper develops upper and lower bounds on the influence measure in a...
The stochastic block model (SBM) is a random graph model with planted cl...