A mean-field limit for certain deep neural networks
Abstract
Understanding deep neural networks (DNNs) is a key challenge in the theory of machine learning, with potential applications to the many fields where DNNs have been successfully used. This article presents a scaling limit for a DNN being trained by stochastic gradient descent. Our networks have a fixed (but arbitrary) number $L\geq 2$ of inner layers; $N\gg 1$ neurons per layer; full connections between layers; and fixed weights (or "random features" that are not trained) near the input and output. Our results describe the evolution of the DNN during training in the limit when $N\to +\infty$, which we relate to a mean-field model of McKean-Vlasov type. Specifically, we show that network weights are approximated by certain "ideal particles" whose distribution and dependencies are described by the mean-field model. A key part of the proof is to show existence and uniqueness for our McKean-Vlasov problem, which does not seem to be amenable to existing theory. Our paper extends previous work on the $L=1$ case by Mei, Montanari and Nguyen; Rotskoff and Vanden-Eijnden; and Sirignano and Spiliopoulos. We also complement recent independent work on $L>1$ by Sirignano and Spiliopoulos (who consider a less natural scaling limit) and Nguyen (who non-rigorously derives similar results).
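As an illustrative sketch (the notation below is ours, not taken from the paper), the kind of mean-field limit studied in the cited $L=1$ prior work can be written as a McKean-Vlasov dynamics: each neuron's weight behaves, as $N\to+\infty$, like an independent copy of a single "ideal particle" driven by a loss that depends on the law of the particle itself.

```latex
% Hedged sketch only; symbols \theta, \Psi, \rho are illustrative and
% are not the paper's notation.
% For L = 1, the N weights \theta^1,\dots,\theta^N trained by SGD are
% approximated by i.i.d. copies of a McKean-Vlasov process:
\begin{align*}
  \mathrm{d}\theta_t &= -\nabla_\theta \Psi(\theta_t,\rho_t)\,\mathrm{d}t,
  \qquad \rho_t = \operatorname{Law}(\theta_t),
\end{align*}
% where \Psi(\theta,\rho) is the effective loss felt by one weight when
% the remaining weights are distributed according to \rho. The fixed
% point between the trajectory and its own law is what makes existence
% and uniqueness nontrivial.
```
The paper's $L\geq 2$ setting generalizes this picture: the limiting object must also track dependencies across layers, which is why the authors describe the approximating "ideal particles" jointly rather than one layer at a time.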
 Publication:
 arXiv e-prints
 Pub Date:
 June 2019
 arXiv:
 arXiv:1906.00193
 Bibcode:
 2019arXiv190600193A
 Keywords:
 Mathematics - Statistics Theory;
 Condensed Matter - Disordered Systems and Neural Networks;
 Mathematics - Probability;
 60K35;
 92B20
 E-Print:
 79 pages and 2 figures