Implicit Regularization in Matrix Sensing via Mirror Descent

by Fan Wu, et al.

We study discrete-time mirror descent applied to the unregularized empirical risk in matrix sensing. In both the general case of rectangular matrices and the particular case of positive semidefinite matrices, a simple potential-based analysis in terms of the Bregman divergence allows us to establish convergence of mirror descent – with different choices of the mirror maps – to a matrix that, among all global minimizers of the empirical risk, minimizes a quantity explicitly related to the nuclear norm, the Frobenius norm, and the von Neumann entropy. In both cases, this characterization implies that mirror descent, a first-order algorithm minimizing the unregularized empirical risk, recovers low-rank matrices under the same set of assumptions that are sufficient to guarantee recovery for nuclear-norm minimization. When the sensing matrices are symmetric and commute, we show that gradient descent with full-rank factorized parametrization is a first-order approximation to mirror descent, in which case we obtain an explicit characterization of the implicit bias of gradient flow as a by-product.
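As a rough illustration of the positive semidefinite setting described above, the following sketch runs discrete-time mirror descent on the unregularized empirical risk. It rests on assumptions that the abstract does not state: the mirror map is taken to be the unnormalized von Neumann entropy Φ(X) = tr(X log X − X), whose gradient is log X, so each step is a gradient step on log X; the initialization is αI; the sensing matrices are symmetric; and the risk is the mean of halved squared residuals. The function names, `alpha`, `eta`, and `steps` are hypothetical choices for illustration, and the mirror maps actually analyzed in the paper may differ.

```python
import numpy as np

def matrix_exp_sym(S):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def empirical_risk(X, A, y):
    """Unregularized risk: mean over i of 0.5 * (<A_i, X> - y_i)^2."""
    residuals = np.einsum("kij,ij->k", A, X) - y
    return 0.5 * np.mean(residuals ** 2)

def risk_gradient(X, A, y):
    """Gradient of the empirical risk with respect to X."""
    residuals = np.einsum("kij,ij->k", A, X) - y
    return np.einsum("k,kij->ij", residuals, A) / len(y)

def mirror_descent_psd(A, y, alpha=1e-3, eta=0.05, steps=2000):
    """Mirror descent with the (assumed) entropy mirror map.

    A: array of shape (n, d, d) holding symmetric sensing matrices.
    y: array of shape (n,) holding the measurements.
    The dual iterate is log X, so X stays positive definite by construction.
    """
    d = A.shape[1]
    log_X = np.log(alpha) * np.eye(d)          # X_0 = alpha * I (assumption)
    for _ in range(steps):
        X = matrix_exp_sym(log_X)
        log_X = log_X - eta * risk_gradient(X, A, y)
        log_X = (log_X + log_X.T) / 2          # remove round-off asymmetry
    return matrix_exp_sym(log_X)
```

Given symmetric measurements `A` and responses `y`, calling `mirror_descent_psd(A, y)` returns a symmetric positive definite estimate. The step size must be small relative to the smoothness of the risk; the default here is a guess, not a value taken from the paper.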




