Probability Seminar

Revision as of 16:32, 12 September 2020


Fall 2020

Thursdays in 901 Van Vleck Hall at 2:30 PM, unless otherwise noted. We usually end for questions at 3:20 PM.

IMPORTANT: In Fall 2020 the seminar is being run online.

If you would like to sign up for the email list to receive seminar announcements, please join our group.

September 17, 2020, Boris Hanin (Princeton and Texas A&M)

Pre-Talk:

Title: Neural Networks for Probabilists

Abstract: Deep neural networks are a centerpiece in modern machine learning. They are also fascinating probabilistic models, about which much remains unclear. In this pre-talk I will define neural networks, explain how they are used in practice, and give a survey of the big theoretical questions they have raised. If time permits, I will also explain how neural networks are related to a variety of classical areas in probability and mathematical physics, including random matrix theory, optimal transport, and combinatorics of hyperplane arrangements.

Talk:

Title: Effective Theory of Deep Neural Networks

Abstract: Deep neural networks are often considered to be complicated "black boxes," for which a full systematic analysis is not only out of reach but also impossible. In this talk, which is based on ongoing joint work with Sho Yaida and Daniel Adam Roberts, I will make the opposite claim. Namely, that deep neural networks with random weights and biases are exactly solvable models. Our approach applies to networks at finite width n and large depth L, the regime in which they are used in practice. A key point will be the emergence of a notion of "criticality," which involves a fine-tuning of model parameters (weight and bias variances). At criticality, neural networks are particularly well-behaved but still exhibit a tension between large values for n and L, with large values of n tending to make neural networks more like Gaussian processes and large values of L amplifying higher cumulants. Our analysis at initialization also has many consequences for networks during and after training, which I will discuss if time permits.

September 24, 2020, Neil O'Connell (Dublin)

October 1, 2020, Marcus Michelen, UIC

Title: Roots of random polynomials near the unit circle

Abstract: It is a well-known (but perhaps surprising) fact that a polynomial with independent random coefficients has most of its roots very close to the unit circle. Using a probabilistic perspective, we understand the behavior of roots of random polynomials exceptionally close to the unit circle and prove several limit theorems; these results resolve several conjectures of Shepp and Vanderbei. We will also discuss how our techniques provide a heuristic, probabilistic explanation for why random polynomials tend to have most roots near the unit circle. Based on joint work with Julian Sahasrabudhe.

October 8, 2020, Subhabrata Sen, Harvard

Title: TBA

Abstract: TBA

November 12, 2020, Alexander Dunlap, NYU Courant Institute

Title: TBA

Abstract: TBA


Past Seminars