Antonio Orvieto

research

∙ 07/21/2023

On the Universality of Linear Recurrences Followed by Nonlinear Projections

In this note (work in progress towards a full-length paper) we show that...

0 Antonio Orvieto, et al. ∙

research

∙ 03/16/2023

Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning

In contrast to the natural capabilities of humans to learn new tasks in ...

0 Sanghwan Kim, et al. ∙

research

∙ 03/11/2023

Resurrecting Recurrent Neural Networks for Long Sequences

Recurrent Neural Networks (RNNs) offer fast inference on long sequences ...

10 Antonio Orvieto, et al. ∙

research

∙ 01/19/2023

An SDE for Modeling SAM: Theory and Insights

We study the SAM (Sharpness-Aware Minimization) optimizer which has rece...

0 Enea Monzio Compagnoni, et al. ∙

research

∙ 09/19/2022

On the Theoretical Properties of Noise Correlation in Stochastic Optimization

Studying the properties of stochastic noise to optimize complex non-conv...

0 Aurelien Lucchi, et al. ∙

research

∙ 06/09/2022

Explicit Regularization in Overparametrized Models via Noise Injection

Injecting noise within gradient descent has several desirable features. ...

0 Antonio Orvieto, et al. ∙

research

∙ 06/07/2022

Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse

Transformers have achieved remarkable success in several domains, rangin...

0 Lorenzo Noci, et al. ∙

research

∙ 02/06/2022

Anticorrelated Noise Injection for Improved Generalization

Injecting artificial noise into gradient descent (GD) is commonly employ...

0 Antonio Orvieto, et al. ∙

research

∙ 01/02/2022

Randomized Signature Layers for Signal Extraction in Time Series Data

Time series analysis is a widespread task in Natural Sciences, Social Sc...

0 Enea Monzio Compagnoni, et al. ∙

research

∙ 12/10/2021

Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity

Gradient descent ascent (GDA), the simplest single-loop algorithm for no...

0 Junchi Yang, et al. ∙

research

∙ 10/25/2021

On the Second-order Convergence Properties of Random Search Methods

We study the theoretical convergence properties of random-search methods...

0 Aurelien Lucchi, et al. ∙

research

∙ 02/23/2021

Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization

Viewing optimization methods as numerical integrators for ordinary diffe...

0 Peiyuan Zhang, et al. ∙

research

∙ 09/01/2020

Learning explanations that are hard to vary

In this paper, we investigate the principle that `good explanations are ...

40 Giambattista Parascandolo, et al. ∙

research

∙ 07/07/2020

An Accelerated DFO Algorithm for Finite-sum Convex Functions

Derivative-free optimization (DFO) has recently gained a lot of momentum...

0 Yuwen Chen, et al. ∙

research

∙ 11/12/2019

Shadowing Properties of Optimization Algorithms

Ordinary differential equation (ODE) models of gradient-based optimizati...

0 Antonio Orvieto, et al. ∙

research

∙ 07/02/2019

The Role of Memory in Stochastic Optimization

The choice of how to retain information about past gradients dramaticall...

6 Antonio Orvieto, et al. ∙

research

∙ 10/05/2018

Continuous-time Models for Stochastic Optimization Algorithms

We propose a new continuous-time formulation for first-order stochastic ...

0 Antonio Orvieto, et al. ∙

