A Numerical Analysis Perspective on Deep Neural Networks

Abstract

In this talk, I illustrate the use of numerical analysis tools for improving the effectiveness of deep learning algorithms. With a focus on deep neural networks that can be modeled as differential equations, I highlight the importance of choosing an adequate time integrator. Using a numerical example, I also compare the first-discretize-then-differentiate and first-differentiate-then-discretize paradigms for training residual neural networks. Finally, I show that even simple (i.e., not deep) architectures can give rise to ill-conditioned learning problems.
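
To make the point about time integrators concrete, here is a minimal sketch. It is not the numerical example from the talk. A residual update of the form y_{j+1} = y_j + h f(y_j) has the same form as a forward Euler step for dy/dt = f(y). The toy right-hand side `rotation_field`, the step size, and all helper names below are assumptions chosen for illustration: the exact solution of this linear problem keeps its norm constant, so any growth in the norm is an artifact of the integrator.

```python
import math

def rotation_field(y):
    """Toy right-hand side (an assumption): dy/dt = A y with A = [[0, 1], [-1, 0]].
    The exact flow is a rotation, so the norm of y is conserved."""
    return [y[1], -y[0]]

def axpy(a, x, y):
    """Return a * x + y for plain Python lists."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def forward_euler(f, y, h, n_steps):
    """Forward Euler: y <- y + h * f(y), the same form as a residual update."""
    for _ in range(n_steps):
        y = axpy(h, f(y), y)
    return y

def rk4(f, y, h, n_steps):
    """Classical fourth-order Runge-Kutta method."""
    for _ in range(n_steps):
        k1 = f(y)
        k2 = f(axpy(0.5 * h, k1, y))
        k3 = f(axpy(0.5 * h, k2, y))
        k4 = f(axpy(h, k3, y))
        incr = [a + 2 * b + 2 * c + d for a, b, c, d in zip(k1, k2, k3, k4)]
        y = axpy(h / 6.0, incr, y)
    return y

def norm(y):
    """Euclidean norm of a two-component state."""
    return math.hypot(y[0], y[1])

y0 = [1.0, 0.0]          # initial state with norm 1
h, n_steps = 0.1, 1000   # hypothetical step size and number of steps ("layers")

norm_euler = norm(forward_euler(rotation_field, y0, h, n_steps))
norm_rk4 = norm(rk4(rotation_field, y0, h, n_steps))

print("forward Euler norm:", norm_euler)
print("RK4 norm:", norm_rk4)
```

With these settings, forward Euler multiplies the norm by sqrt(1 + h^2) at every step, so the state grows steadily, while the Runge-Kutta iterate stays close to the conserved value of 1. The only point of the sketch is that the same continuous model can behave very differently depending on how it is discretized.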

Date
Sep 9, 2019 12:00 PM
Lars Ruthotto
Winship Distinguished Research Associate Professor of Mathematics and Computer Science