# Difference between revisions of "Probability Seminar"

## Revision as of 11:08, 12 September 2020

# Fall 2020

**Thursdays in 901 Van Vleck Hall at 2:30 PM**, unless otherwise noted.
**We usually end for questions at 3:20 PM.**

**IMPORTANT:** In Fall 2020 the seminar is being run online.

If you would like to sign up for the email list to receive seminar announcements then please join our group.

## September 15, 2020, Boris Hanin (Princeton and Texas A&M)

Pre-Talk:

Title: **Neural Networks for Probabilists**

Abstract: Deep neural networks are a centerpiece in modern machine learning. They are also fascinating probabilistic models, about which much remains unclear. In this pre-talk I will define neural networks, explain how they are used in practice, and give a survey of the big theoretical questions they have raised. If time permits, I will also explain how neural networks are related to a variety of classical areas in probability and mathematical physics, including random matrix theory, optimal transport, and combinatorics of hyperplane arrangements.

Talk:

Title: **Effective Theory of Deep Neural Networks**

Abstract: Deep neural networks are often considered to be complicated "black boxes," for which a full systematic analysis is not only out of reach but also impossible. In this talk, which is based on ongoing joint work with Sho Yaida and Daniel Adam Roberts, I will make the opposite claim: namely, that deep neural networks with random weights and biases are exactly solvable models. Our approach applies to networks at finite width n and large depth L, the regime in which they are used in practice. A key point will be the emergence of a notion of "criticality," which involves a fine-tuning of model parameters (weight and bias variances). At criticality, neural networks are particularly well-behaved but still exhibit a tension between large values for n and L, with large values of n tending to make neural networks more like Gaussian processes and large values of L amplifying higher cumulants. Our analysis at initialization also has many consequences for networks during and after training, which I will discuss if time permits.

## September 23, 2020, Neil O'Connell (Dublin)

## October 1, 2020, Marcus Michelen (UIC)

Title: **Roots of random polynomials near the unit circle**

Abstract: It is a well-known (but perhaps surprising) fact that a polynomial with independent random coefficients has most of its roots very close to the unit circle. Using a probabilistic perspective, we understand the behavior of roots of random polynomials exceptionally close to the unit circle and prove several limit theorems; these results resolve several conjectures of Shepp and Vanderbei. We will also discuss how our techniques provide a heuristic, probabilistic explanation for why random polynomials tend to have most roots near the unit circle. Based on joint work with Julian Sahasrabudhe.

## October 8, 2020, Subhabrata Sen (Harvard)

Title: **TBA**

Abstract: TBA

## November 12, 2020, Alexander Dunlap (NYU Courant Institute)

Title: **TBA**

Abstract: TBA