# Probability Seminar

# Fall 2022

**Thursdays at 2:30 PM either in 901 Van Vleck Hall or on Zoom**

We usually end for questions at 3:20 PM.

ZOOM LINK. Valid only for online seminars.

If you would like to sign up for the email list to receive seminar announcements then please join our group.

## September 22, 2022, in person: Pierre Yves Gaudreau Lamarre (University of Chicago)

**Moments of the Parabolic Anderson Model with Asymptotically Singular Noise**

The Parabolic Anderson Model (PAM) is a stochastic partial differential equation that describes the time evolution of a particle system with the following dynamics: each particle in the system undergoes a diffusion in space, and as the particles move through space, they can either multiply or get killed at a rate that depends on a random environment.

One of the fundamental problems in the theory of the PAM is to understand its behavior at large times. More specifically, the solution of the PAM at large times tends to be intermittent, meaning that most of the particles concentrate in small regions where the environment is most favorable for particle multiplication.

In this talk, we discuss a new technique to study intermittency in the PAM with a singular random environment. In short, the technique consists of approximating the singular PAM with a regularized version that becomes increasingly singular as time goes to infinity.

This talk is based on joint work with Promit Ghosal and Yuchen Liao.

## September 29, 2022, in person: Christian Gorski (Northwestern University)

**Strict monotonicity for first passage percolation on graphs of polynomial growth and quasi-trees**

I'll present strict monotonicity results for first passage percolation (FPP) on bounded degree graphs which either have strict polynomial growth (uniform upper and lower volume growth bounds of the same polynomial degree) or are quasi-isometric to a tree; the case of the standard Cayley graph of Z^d is due to van den Berg and Kesten (1993). Roughly speaking, if we use two different weight distributions to perform FPP on a fixed graph, and one of the distributions is "larger" than the other and "subcritical" in some appropriate sense, then the expected passage times with respect to that distribution exceed those of the other distribution by an amount proportional to the graph distance. If "larger" here refers to stochastic domination of measures, this result is closely related to "absolute continuity with respect to the expected empirical measure," that is, the fact that long geodesics "use all possible weights". If "larger" here refers to variability (another ordering on measures), then a strict monotonicity theorem holds if and only if the graph also satisfies a condition we call "admitting detours". I intend to sketch the proof of absolute continuity, and, if time allows, give some indication of the difficulties that arise when proving strict monotonicity with respect to variability.

## October 6, 2022, in person: Daniel Slonim (University of Virginia)

**Random Walks in (Dirichlet) Random Environments with Jumps on Z**

We introduce the model of random walks in random environments (RWRE), which are random Markov chains on the integer lattice. These random walks are well understood in the nearest-neighbor, one-dimensional case due to reversibility of almost every Markov chain. For example, directional transience and limiting speed can be characterized in terms of simple expectations involving the transition probabilities at a single site. The reversibility is lost, however, if we go up to higher dimensions or relax the nearest-neighbor assumption by allowing jumps, and therefore much less is known in these models. Despite this non-reversibility, certain special cases have proven to be more tractable. Random Walks in Dirichlet environments (RWDE), where the transition probability vectors are drawn according to a Dirichlet distribution, have been fruitfully studied in the nearest-neighbor, higher dimensional setting. We look at RWDE in one dimension with jumps and characterize when the walk is ballistic: that is, when it has non-zero limiting velocity. It turns out that in this model, there are two factors which can cause a directionally transient walk to have zero limiting speed: finite trapping and large-scale backtracking. Finite trapping involves finite subsets of the graph where the walk is liable to get trapped for a long time. It is a highly local phenomenon that depends heavily on the structure of the underlying graph. Large-scale backtracking is a more global and one-dimensional phenomenon. The two operate "independently" in the sense that either can occur with or without the other. Moreover, if neither factor on its own is enough to cause zero speed, then the walk is ballistic, so the two factors cannot conspire together to slow a walk down to zero speed if neither is sufficient to do so on its own. This appearance of two independent factors affecting ballisticity is a new feature not seen in any previously studied RWRE models.

## October 13, 2022, ZOOM: Dasha Loukianova (Université d'Évry Val d'Essonne)

## October 20, 2022, **4pm, VV911**, in person: Simon Tavaré (Columbia University)

*Note the unusual time and room!*

**An introduction to counts-of-counts data**

Counts-of-counts data arise in many areas of biology and medicine, and have been studied by statisticians since the 1940s. One of the first examples, discussed by R. A. Fisher and collaborators in 1943 [1], concerns estimation of the number of unobserved species based on summary counts of the number of species observed once, twice, … in a sample of specimens. The data are summarized by the numbers $C_1, C_2, \dots$ of species represented once, twice, … in a sample of size $N = C_1 + 2C_2 + 3C_3 + \cdots$ containing $S = C_1 + C_2 + \cdots$ species; the vector $C = (C_1, C_2, \dots)$ gives the counts-of-counts. Other examples include the frequencies of the distinct alleles in a human genetics sample, the counts of distinct variants of the SARS-CoV-2 S protein obtained from consensus sequencing experiments, counts of sizes of components in certain combinatorial structures [2], and counts of the numbers of SNVs arising in one cell, two cells, … in a cancer sequencing experiment.

In this talk I will outline some of the stochastic models used to model the distribution of $C$, and some of the inferential issues that come from estimating the parameters of these models. I will touch on the celebrated Ewens Sampling Formula [3] and Fisher’s multiple sampling problem concerning the variance expected between values of $S$ in samples taken from the same population [3]. Variants of birth-death-immigration processes can be used, for example when different variants grow at different rates. Some of these models are mechanistic in spirit, others more statistical. For example, a non-mechanistic model is useful for describing the arrival of covid sequences at a database. Sequences arrive one at a time, and are either a new variant, or a copy of a variant that has appeared before. The classical Yule process with immigration provides a starting point to model this process, as I will illustrate.
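As a small illustration of the counts-of-counts summary described in this abstract, the sketch below computes the vector of counts, the sample size N, and the number of species S from a list of observed specimens. The function name `counts_of_counts` and the toy sample are assumptions made for illustration only; they do not come from the talk.

```python
from collections import Counter


def counts_of_counts(sample):
    """Return a dict mapping j to C_j, the number of species seen exactly j times.

    `sample` is any iterable of hashable species labels (hypothetical toy input).
    """
    species_counts = Counter(sample)  # species label -> number of specimens
    return dict(Counter(species_counts.values()))  # j -> C_j


# Toy sample: "a" seen twice, "d" three times, "b", "c", "e" once each.
sample = ["a", "b", "a", "c", "d", "d", "d", "e"]
C = counts_of_counts(sample)

N = sum(j * c for j, c in C.items())  # N = C_1 + 2 C_2 + 3 C_3 + ...
S = sum(C.values())                   # S = C_1 + C_2 + ...
```

For this toy sample, N equals the number of specimens and S equals the number of distinct species, which is a quick consistency check on the two defining sums.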

*References*

[1] Fisher RA, Corbet AS & Williams CB. J Animal Ecology, 12, 1943

[2] Arratia R, Barbour AD & Tavaré S. *Logarithmic Combinatorial Structures,* EMS, 2002

[3] Ewens WJ. Theoret Popul Biol, 3, 1972

[4] Da Silva P, Jamshidpey A, McCullagh P & Tavaré S. Bernoulli Journal, in press, 2022 (online)