We identify incremental learning dynamics in transformers, where the dif...
We study when the neural tangent kernel (NTK) approximation is valid for...
We investigate the time complexity of SGD learning on fully-connected ne...
Comparing the representations learned by different neural networks has r...
We prove computational limitations for learning with neural networks tra...
It is currently known how to characterize functions that neural networks...
This paper identifies a structural property of data distributions that e...
We consider the problem of learning a tree-structured Ising model from d...
The problem of computing Wasserstein barycenters (a.k.a. Optimal Transpo...
Multimarginal Optimal Transport (MOT) is the problem of linear programmi...
Multimarginal Optimal Transport (MOT) has recently attracted significant...
Computing Wasserstein barycenters is a fundamental geometric problem wit...
We initiate the study of the natural multiplayer generalization of the c...
The determinant can be computed by classical circuits of depth O(log^2 n...
The complexity of clique problems on Erdos-Renyi random graphs has becom...
In 2000, Evans et al. [Eva+00] proved the subadditivity of the mutual in...