Trust Region Method for Coupled Systems of PDE Solvers and Deep Neural Networks

by Kailai Xu, et al.

Physics-informed machine learning and inverse modeling require the solution of ill-conditioned non-convex optimization problems. First-order methods, such as SGD and ADAM, and quasi-Newton methods, such as BFGS and L-BFGS, have been applied with some success to optimization problems involving deep neural networks in computational engineering inverse problems. However, empirical evidence shows that convergence and accuracy remain a challenge for these methods. Our study revealed at least two intrinsic defects of these methods when applied to coupled systems of partial differential equations (PDEs) and deep neural networks (DNNs): (1) convergence is often slow, with long plateaus that make it difficult to determine whether the method has converged; (2) quasi-Newton methods do not provide a sufficiently accurate approximation of the Hessian matrix, which typically leads to early termination (one of the optimizer's stopping criteria is satisfied although the achieved error is far from minimal). Based on these observations, we propose to use trust region methods for optimizing coupled systems of PDEs and DNNs. Specifically, we developed an algorithm for second-order physics-constrained learning, an efficient technique to calculate Hessian matrices based on computational graphs. We show that trust region methods overcome many of these defects and exhibit remarkably fast convergence and superior accuracy compared to ADAM, BFGS, and L-BFGS.
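To make the idea of a trust region method with an exact Hessian concrete, the following is a minimal sketch in plain Python. It is not the authors' algorithm and does not involve a PDE solver or a DNN: it minimizes the two-dimensional Rosenbrock function, uses a hand-derived analytic Hessian in place of one computed from a computational graph, and solves the trust region subproblem with a dogleg step (falling back to the Cauchy point when the Hessian is not positive definite). The function names, the dogleg choice, and the radius-update constants (0.25, 0.75, 2.0) are all illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def matvec(H, v):
    return [dot(row, v) for row in H]

# Toy objective (assumption: stands in for the PDE/DNN loss).
def rosenbrock(p):
    x, y = p
    return (1 - x) ** 2 + 100 * (y - x * x) ** 2

def gradient(p):
    x, y = p
    return [-2 * (1 - x) - 400 * x * (y - x * x), 200 * (y - x * x)]

def hessian(p):
    x, y = p
    return [[2 - 400 * (y - 3 * x * x), -400 * x], [-400 * x, 200]]

def dogleg_step(g, H, radius):
    """Approximately minimize g.s + 0.5 s.H.s subject to ||s|| <= radius (2x2 H)."""
    gHg = dot(g, matvec(H, g))
    gnorm = norm(g)
    # Cauchy point: model minimizer along -g, clipped to the region.
    tau = radius / gnorm if gHg <= 0 else min(dot(g, g) / gHg, radius / gnorm)
    cauchy = [-tau * gi for gi in g]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    if det <= 0 or H[0][0] <= 0:
        return cauchy  # Hessian not positive definite: use the Cauchy point
    # Newton step s = -H^{-1} g via the closed-form 2x2 inverse.
    newton = [-(H[1][1] * g[0] - H[0][1] * g[1]) / det,
              -(-H[1][0] * g[0] + H[0][0] * g[1]) / det]
    if norm(newton) <= radius:
        return newton
    if norm(cauchy) >= radius * (1 - 1e-12):
        return cauchy
    # Walk from the Cauchy point toward the Newton point until the boundary.
    d = [n - c for n, c in zip(newton, cauchy)]
    a, b, c = dot(d, d), 2 * dot(cauchy, d), dot(cauchy, cauchy) - radius ** 2
    t = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return [ci + t * di for ci, di in zip(cauchy, d)]

def trust_region_minimize(f, grad, hess, x0, radius=1.0, max_radius=100.0,
                          eta=0.1, tol=1e-8, max_iter=500):
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        if norm(g) < tol:
            return x, k
        H = hess(x)
        s = dogleg_step(g, H, radius)
        predicted = -(dot(g, s) + 0.5 * dot(s, matvec(H, s)))  # model decrease
        trial = [xi + si for xi, si in zip(x, s)]
        actual = f(x) - f(trial)                               # true decrease
        rho = actual / predicted if predicted > 0 else -1.0
        if rho < 0.25:
            radius *= 0.25                        # model was poor: shrink
        elif rho > 0.75 and norm(s) >= 0.99 * radius:
            radius = min(2.0 * radius, max_radius)  # model was good: expand
        if rho > eta:
            x = trial                             # accept the step
    return x, max_iter

solution, iterations = trust_region_minimize(
    rosenbrock, gradient, hessian, [-1.2, 1.0])
```

The ratio `rho` of actual to predicted decrease is what distinguishes this scheme from a line search: the step is accepted or rejected, and the region resized, according to how well the quadratic model built from the exact Hessian matched the true objective.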


