What is the gradient of a scalar function of a symmetric matrix?
A perusal of research articles on matrix calculus reveals two different approaches to calculating the gradient of a real-valued function of a symmetric matrix, and the two approaches lead to different results. In the mechanics and physics communities, the gradient is calculated from the definition of a derivative, irrespective of whether the argument is symmetric. Members of the statistics, economics, and electrical engineering communities instead use a notion of the gradient that explicitly takes the symmetry of the matrix into account. This "symmetric gradient" G_s is reported to be related to the gradient G, computed from the derivative with respect to a general matrix, as G_s = G + G^T - G ∘ I, where ∘ denotes the elementwise (Hadamard) product and I is the identity matrix. We demonstrate that this relation is incorrect, and reconcile the two viewpoints by proving that G_s = sym(G), where sym(G) = (G + G^T)/2 is the symmetric part of G.
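The two candidate formulas can be compared numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the gradient is identified through the Frobenius inner product, so that the directional derivative of f along a symmetric direction H should equal the sum of the elementwise product of the gradient with H. The test function f(X) = a^T X b, the vectors a and b, and the matrices S and H are all hypothetical choices made for the example.

```python
import numpy as np


def f(X, a, b):
    """Hypothetical test function f(X) = a^T X b (linear in X)."""
    return a @ X @ b


def sym(M):
    """Symmetric part of a square matrix."""
    return 0.5 * (M + M.T)


# Illustrative data (assumptions, not from the paper).
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])   # symmetric evaluation point
H = np.array([[1.0, 2.0, 0.0],
              [2.0, -1.0, 1.0],
              [0.0, 1.0, 3.0]])   # symmetric perturbation direction

# Gradient with respect to a general (unconstrained) matrix: G = a b^T.
G = np.outer(a, b)

# The two candidates for the symmetric gradient.
G_sym = sym(G)                          # G_s = sym(G)
G_claimed = G + G.T - G * np.eye(3)     # G_s = G + G^T - G ∘ I

# Central finite difference of f along the symmetric direction H.
t = 1e-3
directional = (f(S + t * H, a, b) - f(S - t * H, a, b)) / (2 * t)

# Frobenius inner products of each candidate with H.
ip_sym = np.sum(G_sym * H)
ip_claimed = np.sum(G_claimed * H)

print("directional derivative:", directional)
print("<sym(G), H>:           ", ip_sym)
print("<G + G^T - G∘I, H>:    ", ip_claimed)
```

For this data the directional derivative and the inner product with sym(G) both come out to about 101, while the inner product with G + G^T - G ∘ I is 154. Under the stated assumption, only sym(G) reproduces the directional derivative along a symmetric direction.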