MATH Seminar

Title: Structure-conforming Operator Learning via Transformers
Seminar: Numerical Analysis and Scientific Computing
Speaker: Shuhao Cao of the University of Missouri-Kansas City
Contact: Yuanzhe Xi, yuanzhe.xi@emory.edu
Date: 2024-03-21 at 10:00AM
Venue: MSC W201
Abstract: GPT, Stable Diffusion, AlphaFold 2, and many other state-of-the-art deep learning models use a neural architecture called the "Transformer". Since the publication of Google's "Attention Is All You Need" paper, the Transformer has become the ubiquitous architecture in deep learning. At the heart of the Transformer is the "attention mechanism". In this talk, we shall dissect the attention mechanism through the lens of traditional numerical methods, such as Galerkin methods and hierarchical matrix decomposition. We will report numerical results on designing attention-based neural networks according to the structure of a problem in traditional scientific computing, such as inverse problems for the Neumann-to-Dirichlet operator (electrical impedance tomography, EIT) or multiscale elliptic problems. Progress within different communities will be briefly reviewed to address some open problems on the mathematical properties of the attention mechanism in Transformers.