The Sharpness-Aware Minimization (SAM) optimization algorithm has been s...
Recent studies of gradient descent with large step sizes have shown that...
Deep equilibrium networks (DEQs) are a promising way to construct models...
Can deep learning solve multiple tasks simultaneously, even when they ar...
The softmax function combined with a cross-entropy loss is a principled...
Large neural network models have been successful in learning functions o...