Understanding Loss Landscapes of Neural Network Models in Solving Partial Differential Equations

by Keke Wu, et al.

Solving partial differential equations (PDEs) by parametrizing their solutions with neural networks (NNs) has become popular in the past few years. However, different types of loss functions can be proposed for the same PDE. For the Poisson equation, the loss function can be based on the weak formulation of energy variation or on the least squares method, which lead to the deep Ritz model and the deep Galerkin model, respectively. The loss landscapes of these different models give rise to different practical performance when training the NN parameters. To investigate and understand such practical differences, we propose to compare the loss landscapes of these models, which are both high dimensional and highly non-convex. In such settings, roughness is more important than traditional eigenvalue analysis for describing the non-convexity. We contribute to the landscape comparison by proposing a roughness index that scientifically and quantitatively describes the heuristic concept of "roughness" of the landscape around minimizers. This index is based on random projections and the variance of the (normalized) total variation of one-dimensional projected functions, and it is efficient to compute. A large roughness index hints at an oscillatory landscape profile, which poses a severe challenge for first-order optimization methods. We apply this index to the two models for the Poisson equation, and our empirical results reveal a consistent general observation: the landscapes of the deep Galerkin method around its local minimizers are less rough than those of the deep Ritz method, which supports the observed gain in accuracy of the deep Galerkin method.
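
To make the idea of the roughness index more concrete, the following is a minimal sketch of how such a quantity might be computed. The abstract only states that the index uses random projections and the variance of the (normalized) total variation of one-dimensional projected functions; everything else here is an assumption for illustration, not the authors' definition. In particular, the function names `normalized_total_variation` and `roughness_index`, the normalization by the range of the profile, the sampling interval `radius`, and the toy loss functions are all hypothetical.

```python
import numpy as np


def normalized_total_variation(values):
    """Discrete total variation of a sampled 1D profile, divided by its range.

    Assumption: normalizing by (max - min) is one plausible choice; the
    abstract does not specify the normalization.
    """
    values = np.asarray(values, dtype=float)
    total_variation = np.abs(np.diff(values)).sum()
    value_range = values.max() - values.min()
    return total_variation / value_range if value_range > 0 else 0.0


def roughness_index(loss, theta_star, n_directions=20, radius=1.0,
                    n_points=101, seed=0):
    """Variance, over random unit directions, of the normalized total
    variation of the 1D profile t -> loss(theta_star + t * d)."""
    rng = np.random.default_rng(seed)
    theta_star = np.asarray(theta_star, dtype=float)
    ts = np.linspace(-radius, radius, n_points)
    tv_values = []
    for _ in range(n_directions):
        # Random projection: draw a Gaussian direction and normalize it.
        direction = rng.standard_normal(theta_star.shape)
        direction /= np.linalg.norm(direction)
        profile = [loss(theta_star + t * direction) for t in ts]
        tv_values.append(normalized_total_variation(profile))
    return float(np.var(tv_values))


# Toy landscapes (hypothetical, not PDE losses): a smooth bowl and the same
# bowl with a fast oscillation superimposed.
def smooth_loss(theta):
    return float(np.sum(theta ** 2))


def oscillatory_loss(theta):
    return float(np.sum(theta ** 2) + 0.5 * np.sum(np.sin(20.0 * theta)))


theta_star = np.zeros(10)
print("smooth:     ", roughness_index(smooth_loss, theta_star))
print("oscillatory:", roughness_index(oscillatory_loss, theta_star))
```

In this toy setting the smooth bowl looks the same along every direction, so its index is essentially zero, while the oscillatory landscape yields a larger value. For an actual deep Ritz or deep Galerkin model, `loss` would evaluate the training loss at a flattened parameter vector near a trained minimizer.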


Local Randomized Neural Networks with Discontinuous Galerkin Methods for Partial Differential Equations

Randomized neural networks (RNN) are a variation of neural networks in w...

MIM: A deep mixed residual method for solving high-order partial differential equations

In recent years, a significant amount of attention has been paid to solv...

Applications of the Deep Galerkin Method to Solving Partial Integro-Differential and Hamilton-Jacobi-Bellman Equations

We extend the Deep Galerkin Method (DGM) introduced in Sirignano and Spi...

Deep adaptive basis Galerkin method for high-dimensional evolution equations with oscillatory solutions

In this paper, we study deep neural networks (DNNs) for solving high-dim...

Towards fast weak adversarial training to solve high dimensional parabolic partial differential equations using XNODE-WAN

Due to the curse of dimensionality, solving high dimensional parabolic p...

Implicit bias with Ritz-Galerkin method in understanding deep learning for solving PDEs

This paper aims at studying the difference between Ritz-Galerkin (R-G) m...

Compressive Isogeometric Analysis

This work is motivated by the difficulty in assembling the Galerkin matr...
