**Speaker:** Dr. Elizabeth Newman from Emory University

**Date and Time:** Friday, November 12, 2021, 1pm–2pm. Hybrid (in-person at Fretwell 315 and via Zoom). Please contact Qingning Zhou to obtain the Zoom link.

**Title:** How to Train Better: Exploiting the Separability of Deep Neural Networks

**Abstract:** You would be hard-pressed to find anyone who hasn’t heard the hype about deep neural networks (DNNs). These high-dimensional function approximators, composed of simple layers parameterized by weights, have proven successful in countless applications. What the hype-sters won’t tell you is this: DNNs are challenging to train. Typically, the training problem is posed as a stochastic optimization problem with respect to the DNN weights. With millions of weights, a non-convex and non-smooth objective function, and many hyperparameters to tune, solving the training problem well is no easy task. In this talk, our goal is simple: we want to make DNN training easier. To this end, we will exploit the separability of commonly used DNN architectures; that is, the weights of the final layer of the DNN are applied linearly. We will leverage this linearity using two different approaches. First, we will approximate the stochastic optimization problem via a sample average approximation (SAA). In this setting, we can eliminate the linear weights through partial optimization, a method affectionately known as Variable Projection (VarPro). Second, in the stochastic approximation (SA) setting, we will consider a powerful iterative sampling approach to update the linear weights, which notably incorporates automatic regularization parameter selection methods. Throughout the talk, we will demonstrate the efficacy of these two approaches using numerical examples.
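To make the separability idea concrete, here is a minimal sketch (not the speaker's implementation) of the VarPro-style elimination described above. It assumes a toy separable model `f(x) = phi(x; theta) @ W`, where `phi` is a hypothetical one-hidden-layer feature map with nonlinear weights `theta` and the final-layer weights `W` enter linearly. For fixed `theta`, the optimal `W` solves a (Tikhonov-regularized) linear least-squares problem, yielding a reduced objective in `theta` alone; the function names and the choice of `tanh` and regularization value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(X, theta):
    """Hypothetical nonlinear feature map: one tanh hidden layer."""
    A, b = theta
    return np.tanh(X @ A + b)  # shape (n_samples, n_hidden)

def reduced_objective(X, Y, theta, reg=1e-6):
    """Eliminate the linear final-layer weights W by partial optimization
    (a VarPro-style step): solve a regularized least-squares problem for W
    with theta held fixed, then return the residual loss and optimal W."""
    Z = phi(X, theta)
    H = Z.T @ Z + reg * np.eye(Z.shape[1])   # normal equations + Tikhonov term
    W = np.linalg.solve(H, Z.T @ Y)          # optimal linear weights for this theta
    R = Z @ W - Y                            # residual at the partial optimum
    return 0.5 * np.sum(R**2), W

# Toy regression data: fit y = sin(x) on [-3, 3]
X = np.linspace(-3, 3, 200).reshape(-1, 1)
Y = np.sin(X)

# Random nonlinear weights theta = (A, b); only these remain to be optimized
theta = (rng.standard_normal((1, 20)), rng.standard_normal(20))
loss, W = reduced_objective(X, Y, theta)
```

After this elimination, an outer optimizer (e.g., gradient descent or a Gauss-Newton method) searches only over the nonlinear weights `theta`, which is the dimension reduction that makes the SAA training problem easier to solve.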