# Applied Algebra Seminar Fall 2023

**When**: 11am Central, Thursdays

**Where**: Van Vleck 901 Hall or Virtual: Zoom link (The passcode: Bucky )

**Email List**: subscribe to our google group or email JIR to add you.

**Contacts**: Jose Israel Rodriguez (Lead), Shamgar Gurevich

**How**: Talks are in-person talks and 40-50 minutes long.

**Inclusiveness**: We welcome participants with applied algebra interests from other institutions and departments. We have social time 5-10 minutes before the talk starts.

## Fall 2023 Schedule

- Contact jRodriguez43@wisc.edu to suggest speakers. This semester will be sporadic because of the IMSI Long program Algebraic Statistics and Our Changing world.

date | Location | speaker | title |
---|---|---|---|

Sept 14 | VV901 | Felix Rydell | Euclidean Distance Degree Polynomials Associated to Families of Rational Maps |

Dec 21 | VV901 | Jiayi Li | Geometry of Polynomial Neural Networks |

## Abstracts 2023 Fall

### Jiayi Li (UCLA)

**Geometry of Polynomial Neural Networks**

Polynomial neural networks achieved remarkable performance in practical tasks such as face detection, image generation, and forecasting trading signals. Besides its empirical success, the underlying theory remains an open field of research. We study the expressivity and the learning process of polynomial neural networks from an algebraic perspective. The functional space parameterized by the weights of a network is a semi-algebraic set. We refer to the functional space as the neuromanifold, and its Zariski closure as the neurovariety. We studied the dimension, defectiveness, and the learning degree of different network architectures, providing a theoretical understanding of the expressivity and optimization complexity of these architectures. Our results are accompanied by experiments. This is joint work with Kaie Kubjas and Maximilian Wiesmann.

### Felix Rydell (KTH)

**Euclidean Distance Degree Polynomials Associated to Families of Rational Maps**

Reconstructing 3D objects such as points and lines based on their images onto several views is a fundamental problem in computer vision. Given an element in a multiview variety, defined informally as the set of all object matches for a given set of views, we seek to find the 3D object that projects onto them. However, in practice data comes with noise and therefore doesn't strictly lie on the multiview variety. We remedy this by first solving the Euclidean distance problem, i.e. fitting the data onto the multiview variety by finding the closest point on it. The complexity of this problem is often measured via the Euclidean distance degree (EDD), which is the number of complex critical points of this optimization problem. A recent example is the computation of the EDD of the point multiview variety, which turns out to be a polynomial in the number of factors of degree 3.

In our work, we investigate the following more general version of the problem: Consider a finite dimensional family of rational maps with fixed domain and codomain. The map sending a point in the domain to the image tuple defined by a number of generic maps in this family defines a sequence of varieties, one for each number of generic maps. We aim to show that under reasonable assumptions, the EDD of such sequences of varieties are polynomials in the number of maps of degree equal to the dimension of the domain.

## Earlier Abstracts

### Aravind Baskar (Notre Dame)

**Optimal synthesis of robotic mechanisms using homotopy continuation**

Kinematic synthesis aims to find the dimensions of a mechanism after desired constraints have been posed on its motion. For exact synthesis, the number of constraints and dimensional design variables are equal. For approximate synthesis, the former exceeds the latter. The current research tackles approximate synthesis via an optimization formulation that is invariant to the number of constraint specifications. The objective function, which is polynomial in nature, is a sum of squares of the error in the kinematic equations at the task specifications. Using the first-order necessary conditions of optimality, solving the optimization problem boils down to that of finding the roots of a system of polynomial equations. Although these systems are of a larger total degree than their respective exact synthesis systems, homotopy continuation techniques based on multi-homogeneous structure and monodromy-based methods can be used to solve these systems completely. The roots of such systems comprise all the critical points including minima, saddles, and maxima which characterize the design space. Since secondary considerations further dictate whether a minimum is feasible for a practical design, some low index saddles are also found to lead to suitable designs. Investigating gradient descent paths starting from these low index saddles into the adjoining minima leads to useful families of solutions. This provides the designer with parameterized sets of design candidates enabled by the computation of all possible critical points of the objective. Two design case studies, namely, the design of a humanoid finger and the design of a robot leg for energetic hops are presented.

### Kaitlyn Phillipson (UW Madison)

**What can algebra and geometry tell us about a stimulus space?**

Neural activity can be stored as collections of binary strings, called neural codes, which can provide information about the stimulus that the brain is receiving. In 1971, O'Keefe and Dostrovksy [OD71] found hippocampal neurons, called place cells, which fire when a neuron is in a specific location in a stimulus space, called the receptive field of the corresponding place cell. When the animal leaves the receptive field, the neuron goes silent, and when the animal returns to the receptive field, the neuron begins firing again, thus making a memory map of the space. Furthermore, it was discovered that these receptive fields were roughly convex sets in the stimulus space. A natural mathematical question, then, is: can we determine whether or not a general neural code can represent a convex receptive field diagram from the combinatorial structure of the code itself? While this question is still open in general, this talk will introduce techniques from algebra and geometry that, when applied to this problem, give some results that partially answer this question.

### Arthur Bik (IAS)

**Strength of (infinite) polynomials**

Usually, when presented with a huge set of data in the form of a tensor, one seeks to decompose this tensor as a sum of a low number of pure tensors in order to find hidden structure. While this often works in practice, it is not clear why such a thing should be possible. Define the strength of a tensor T as the minimal number r such that T can be written as a sum of r tensors with a rank-1 flattening. For this notion of rank, we have the following result: Any nontrivial property of tensors that is preserved under changing coordinates in each direction implies bounded strength (independent of the dimensions of the tensor). In other words, given a huge tensor with structure we can expect it to be have low strength. This motivates us to study this rank function. In this talk, we focus on the strength of homogeneous polynomials (i.e. symmetric tensors).

Ananyan and Hochster defined the strength (before that known as Schmidt rank) of a polynomial in their paper proving Stillman's conjecture and it has appeared in various works since. I will state what is known and not known about it. Afterwards, we go to the infinite setting, where I will discuss how strength and other notions derived from it can be used to understand when infinite polynomial series degenerate to each other.

### Colby Long

**Algebraic Invariants in Phylogenetics.**

Phylogenetics is the study of the evolutionary relationships among organisms. A central challenge in the field is to use biological data for a given set of organisms in order to determine the evolutionary tree that describes their history. One approach for doing so is to build tree-based models of DNA sequence evolution and then to select the model that best fits the data according to some criteria. Once a model is selected, the tree parameter of the model is assumed to describe the history of the taxa under consideration. However, in order for such a method to be reliable, the tree parameter of the model must be identifiable. In this talk, we will discuss how *phylogenetic invariants*, algebraic relationships among DNA site-pattern frequencies, can be used to establish model identifiability. We’ll also discuss some invariants-based methods for phylogenetic inference.

### Zvi Rosen

**Neural Codes and Oriented Matroids**

In the 1970's, neuroscientists discovered that the hippocampus of a rat contains a virtual map of its environment. When the rat is in region 1, neuron 1 fires an electric impulse, when it moves to region 2, neuron 2 fires, and in the intersection, both neurons fire. In the decades since, algebraists have begun to model this situation abstractly as follows: Fix *n* convex subsets of Euclidean space, representing stimuli; then, record all possible subsets of *[n]* whose intersection is nonempty and not covered by other sets. This combinatorial object is called a "convex neural code."

In this talk, we relate the emerging theory of convex neural codes to the established theory of oriented matroids, which generalize systems of signed linear relations. By way of this connection, we prove that all convex codes are related to some representable oriented matroid, and we show that deciding whether a neural code is convex is NP-hard.

### Jiaxin Hu

**Beyond matrices: Statistical learning with tensor data**

Tensors efficiently represent the multiway data and serve as the foundation of higher-order data analysis in various fields such as social networks, neuroscience, and genetics studies. Identifying the low-dimensional representation from noisy tensor observations is the primary goal in many statistical learning problems including interpretable tensor decomposition, tensor regression, and multiway clustering. Though people may unfold tensors to matrices and apply the matrix methods, the unfolding strategy ignores the higher-order structure and leads to sub-optimal results in several tensor problems. The gap between matrix and tensor analyses motivates the development of tensor tools.

In this talk, I will focus on the statistical methods with higher-order tensors. First, I will introduce the basic tensor concepts and overview the low-dimensional tensor methods with various applications. Second, I will present my works on generalized tensor decomposition with multiple features and degree-corrected tensor block model for multiway clustering. We will see the difference between the matrix and tensor analyses from both modelling and theoretical aspects.

### Hanbaek Lyu

**Phase transition in contingency tables with two linear margins**

Contingency tables are matrices of nonnegative integer entries with prescribed row sums and columns sums (margins). They are fundamental objects in statistics for studying dependence structure between two or more variables, and they also correspond to bipartite multi-graphs with given degrees and play an important role in combinatorics and graph theory. Both counting the number of contingency tables and sampling one uniformly at random are fundamental problems concerning contingency tables with many connections and applications to other fields (e.g., testing hypothesis on co-occurrence of species in ecology).

A historic guiding principle to these problems is the *independence heuristic*, which was introduced by I. J. Good as far back as in 1950. The heuristic states that the constraints for the rows and columns of the table are asymptotically independent as the size of the table grows to infinity. This yields a simple yet surprisingly accurate formula for the count of contingency tables as well as a sampling heuristic using the hypergeometric (or Fisher-Yates) distribution. In this talk, we will discuss that the independence heuristic captures only one part of the picture. Specifically, for n by n contingency tables with two linear margins $BCn$ and $Cn$, we establish that contingency tables exhibit a sharp phase transition at $B = B_{c} = 1+\sqrt{1+1/C}. This yields that the independence heuristic may capture the structure of contingency tables well for near-homogeneous margins (when $B<B_{c}$), but in if the margins are not quite homogeneous ($B>B_{c}$) positive correlation between rows and columns may emerge and the heuristic fails dramatically.

### Kathryn Heal

** A Lighting-Invariant Point Processor for Shape from Shading**

Under the conventional diffuse shading model with unknown directional lighting, the set of quadratic surface shapes that are consistent with the spatial derivatives of intensity at a single image point is a two-dimensional algebraic variety embedded in the five-dimensional space of quadratic shapes. In this talk will describe the geometry of this variety, and introduce a concise feedforward model that computes an explicit, differentiable approximation of the variety from the intensity and its derivatives at any single image point. The result is a parallelizable processor that operates at each image point and produces a lighting-invariant descriptor of the continuous set of compatible surface shapes at the point. I will describe two applications of this processor: two-shot uncalibrated photometric stereo and quadratic-surface shape from shading. This talk will touch on both theory and application.

### Abeer Al Ahmadieh

**Determinantal Representations and the Image of the Principal Minor Map**

The principal minor map takes an n by n square matrix to the length $2^n$-vector of its principal minors. A basic question is to give necessary and sufficient conditions that characterize the image of various spaces of matrices under this map. In this talk I will describe the image of the space of complex matrices using a characterization of determinantal representations of multiaffine polynomials, based on the factorization of their Rayleigh differences. Using these techniques I will give equations and inequalities characterizing the images of the spaces of real and complex symmetric and Hermitian matrices. For complex symmetric matrices this recovers a result of Oeding from 2011. This is based on a joint work with Cynthia Vinzant.

### Daniel Pimentel Alarcon

**Intersecting Schubert Varieties for Subspace Clustering**

Subspace clustering aims to assign a collection of data points to the smallest possible union of subspaces. This is particularly challenging when data is missing. In this talk I will describe a new approach to overcome missing data. The main idea is to search for subspaces in the Grassmannian that simultaneously fit multiple incomplete-data points. This boils down to identifying the intersection of the Schubert varieties defined by the observed data, which is easier said than done. I will tell you what we have achieved so far, and the remaining challenges.

### Chung Pang Mok

**Pseudorandom Vectors Generation Using Elliptic Curves And Applications to Wiener Processes**

Using the arithmetic of elliptic curves over finite fields, we present an algorithm for the efficient generation of sequence of uniform pseudorandom vectors in high dimension with long period, that simulates sample sequence of a sequence of independent identically distributed random variables, with values in the hypercube $[0,1]^d$ with uniform distribution. As an application, we obtain, in the discrete time simulation, an efficient algorithm to simulate, uniformly distributed sample path sequence of a sequence of independent standard Wiener processes.

### Ola Sobieska

**Identifiability of Linear Compartmental Models: The Impact of Removing Leaks and Edges**

In biology, pharmacokinetics, and various other fields, linear compartmental models are used to represent the transfer of substances between compartments in a system, e.g. the interchange of a drug between organs in the body. Such a model is identifiable if we can recover the flow parameters from generic input and output data. In this talk, I analyze changes to identifiability resulting from deleting a pathway between two compartments or from adding or deleting an internal leak. In particular, I will present some criteria for identifiability dependant on the combinatorics of the parent model, as well as a restatement and partial answer to a conjecture of Gross, Harrington, Meshkat, and Shu. This is joint work with Patrick Chan, Katherine Johnston, Anne Shiu, and Clare Spinner.

### Diego Cifuentes

**Polynomial time guarantees for the Burer-Monteiro method**

The Burer-Monteiro method is one of the most widely used techniques for solving large-scale semidefinite programs (SDP). The basic idea is to solve a nonconvex program in Y, where Y is an n×p matrix such that X = YYT. In this paper, we show that this method can solve SDPs in polynomial time in a smoothed analysis setting. More precisely, we consider an SDP whose domain satisfies some compactness and smoothness assumptions, and slightly perturb the cost matrix and the constraints. We show that if p ≳ √(2+2η)m, where m is the number of constraints and η>0 is any fixed constant, then the Burer-Monteiro method can solve SDPs to any desired accuracy in polynomial time, in the setting of smooth analysis. Our bound on p approaches the celebrated Barvinok-Pataki bound in the limit as η goes to zero, beneath which it is known that the nonconvex program can be suboptimal.

### Papri Dey

**Hyperbolic Polynomials, and Helton-Vinnikov curves**

In this talk, I shall give a panoramic view of my research work. I shall introduce the notion of hyperbolic polynomials and discuss an algebraic method to test hyperbolicity of a multivariate polynomial w.r.t some fixed point via sum-of-squares relaxation, proposed in my research work. An important class of hyperbolic polynomials are definite determinantal polynomials. Helton-Vinnikov curves in the projective plane are cut-out by hyperbolic polynomials in three variables. This leads to the computational problem of explicitly producing a symmetric positive definite linear determinantal representation for a given curve. I shall focus on two approaches to this problem proposed in my research work: an algebraic approach via solving polynomial equations, and a geometric-combinatorial approach via scalar product representations of the coefficients and its connection with polytope membership problem. The algorithms to solve determinantal representation problems are implemented in Macaulay2 as a software package DeterminantalRepresentations. m2.

### Angélica Torres

**The line multi-view variety in computer vision**

Given an arrangement of cameras, the line multi-view variety is the Zariski closure of the image of world lines in the cameras. This is a similar definition to the multi-view variety studied extensively in computer vision. The goal of this talk is to present a description of the ideal of the line multi-view variety, and discuss some geometric properties of this variety. This is joint work with Felix Rydell, Elima Shehu, Paul Breiding, and Kathlén Kohn.

### Kevin Shu

**Nonnegative Quadratics over Stanley Reisner Varieties**

Nonnegative polynomials are of fundamental interest in the field of real algebraic geometry. We will discuss the convex geometry of nonnegative polynomials over an interesting class of algebraic varieties known as Stanley-Reisner varieties. Stanley-Reisner varieties are fairly simple, as they are unions of coordinate planes, but they have surprising connections to algebraic topology that we can use to obtain a detailed understanding of the convex geometry of these nonnegative quadratics. We will be particularly interested in the extreme nonnegative quadratic forms over such varieties. Though this will not be emphasized in the talk, this subject also has some direct applications to optimization theory.

### Jesús A. De Loera

**A Combinatorial Geometry View of Inference and Learning Problems**

In statistical inference we wish to learn the properties or parameters of a distribution through sufficiently many samples. A famous example is logistic regression, a popular non-linear model in multivariate statistics and supervised learning. Users often rely on optimizing of maximum likelihood estimation, but how much training data do we need, as a function of the dimension of the covariates of the data, before we expect an MLE to exist with high probability? Similarly, for unsupervised learning and non-parametric statistics, one wishes to uncover the shape and patterns from samples of a measure or measures. We use only the intrinsic geometry and topology of the sample. A famous example of this type of method is the $k$-means clustering algorithm. A fascinating challenge is to explain the variability of behavior of $k$-means algorithms with distinct random initializations and the shapes of the clusters.

In this talk we present new stochastic combinatorial theorems, that give bounds on the probability of existence of maximum likelihood estimators in multinomial logistic regression and also relate to the variability of clustering initializations. Along the way we will see fascinating connections to the coupon collector problem, topological data analysis, and to the computation of Tukey centerpoints of data clouds (a high-dimensional generalization of median). This is joint work with T. Hogan, R. D. Oliveros, E. Jaramillo-Rodriguez, and A. Torres-Hernandez.

### Timo de Wolff

**Certificates of Nonnegativity and Maximal Mediated Sets**

In science and engineering, we regularly face polynomial optimization problems, that is: minimize a real, multivariate polynomial under polynomial constraints. Solving these problems is essentially equivalent to certifying of nonnegativity of real polynomials -- a key problem in real algebraic geometry since the 19th century. Since this problem is notoriously hard to solve, one is interested in certificates that imply nonnegativity and are easier to check than nonnegativity itself. In particular, a polynomial is nonnegative if it is a sums of squares (SOS) of other polynomials. In 2014, Iliman and I introduced a new certificate of nonnegativity based on sums of nonnegative circuit polynomials (SONC), which I have developed further since then both in theory and practice joint with different coauthors.

A canonical question is: When is a nonnegative circuit polynomial also a sums of squares. It turns out that this question is purely combinatorial and is governed by sets of particular (integral) lattice points called maximal mediated sets, which where first introduced by Reznick. In this talk, I will give a brief introduction to nonnegativity, SOS, and SONC, and then discuss recent results on maximal mediated sets.

### Jeroen Zuiddam

**Geometric rank of tensors and applications**

There are several important problems in computational complexity and extremal combinatorics that can naturally be phrased in terms of tensors. This includes the matrix multiplication problem, the cap set problem and the sunflower problem. We present a new geometric method in this area, called Geometric Rank (Kopparty–Moshkovitz–Zuiddam, 2020). Geometric Rank is a natural extension of matrix rank to tensors. As an application, we use Geometric Rank to solve a problem of Strassen about matrix multiplication tensors.

### Anna Seigal

**Invariant Theory for Maximum Likelihood Estimation**

Maximum likelihood estimation is an optimization problem over a statistical model, to obtain the parameters that best fit observed data. I will focus on two settings: log-linear models and Gaussian group models. I will describe connections between maximum likelihood estimation and notions of stability from invariant theory. This talk is based on joint work with Carlos Améndola, Kathlén Kohn and Philipp Reichenbach.

### Hamza Fawzi

**Semidefinite representations, and the set of separable states**

The convex set of separable states plays a fundamental role in quantum information theory as it corresponds to the set of non-entangled states. In this talk I will discuss the question of semidefinite programming representations for this convex set. Using connections with nonnegative polynomials and sums of squares I will show that except in the low-dimensions, this semialgebraic set has no semidefinite representation. As a consequence this set furnishes a counter-example to the Helton-Nie conjecture. The proof I will present is elementary and relies only on basic results about semialgebraic sets and functions.

### James Saunderson

**Lifting For Simplicity: Concise Descriptions of Convex Sets**

This talk will give a selective overview of the theory and applications of lifts of convex sets. A lift of a convex set is a higher-dimensional convex set that projects onto the original set. Many interesting convex sets have lifts that are dramatically simpler to describe than the original set. Finding such simple lifts has significant algorithmic implications, particularly for associated optimization problems. We will consider both the classical case of polyhedral lifts, which are described by linear inequalities, as well as spectrahedral lifts, which are defined by linear matrix inequalities. The talk will include discussion of ways to construct lifts, ways to find obstructions to the existence of lifts, and a number of interesting examples from a variety of mathematical contexts. (Based on joint work with H. Fawzi, J. Gouveia, P. Parrilo, and R. Thomas).

### Greg Blekherman

**Locally Positive Semidefinite Matrices**

The cone of positive semidefinite matrices plays a prominent role in optimization, and many hard computational problems have well-performing semidefinite relaxations. In practice, enforcing the constraint that a large matrix is positive semidefinite can be expensive. We introduce the cone of k-locally posiitive semidefinite matrices, which consists of matrices all of whose k by k principal submatrices are positive semidefinite. We consider the distance between the cones of positive and locally positive semidefinite matrices, and possible eigenvalues of locally positive semidefinite matrices. Hyperbolic polynomials play a role in some of the proofs. Joint work with Santanu Dey, Marco Molinaro, Kevin Shu and Shengding Sun.

### Carla Michini

**Short simplex paths in lattice polytopes**

We consider the problem of optimizing a linear function over a lattice polytope P contained in [0,k]^n and defined via m linear inequalities. We design a simplex algorithm that, given an initial vertex, reaches an optimal vertex by tracing a path along the edges of P of length at most O(n^6 k log k). The length of this path is independent on m and is the best possible up to a polynomial function, since it is only polynomially far from the worst case diameter. The number of arithmetic operations needed to compute the next vertex in the path is polynomial in n, m and log k. If k is polynomially bounded by n and m, the algorithm runs in strongly polynomial time. This is a joint work with Alberto Del Pia.

### Alisha Zachariah

**Efficiently Estimating a Sparse Delay-Doppler Channel**

Multiple wireless sensing tasks, e.g., radar detection for driver safety, involve estimating the ”channel” or relationship between signal transmitted and received. In this talk, I will focus on a certain type of channel known as the delay-doppler channel. This channel model starts to be applicable in high frequency carrier settings, which are increasingly common with recent developments in mmWave technology. Moreover, in this setting, both the channel model and existing technologies are amenable to working with signals of large bandwidth, and using such signals is a standard approach to achieving high resolution channel estimation. However, when high resolution is desirable, this approach creates a tension with the desire for efficiency because, in particular, it immediately implies that the signals in play live in a space of very high dimension N (e.g., ~10^6 in some applications), as per the Shannon-Nyquist sampling theorem.

To address this, I will propose a randomized algorithm for channel estimation in the k-sparse setting (e.g., k objects in radar detection), with sampling and space complexity both on the order of k(log N)^2, and arithmetic complexity on the order of k(log N)^3+k^2, for N sufficiently large.

While this algorithm seems to be extremely efficient -- to the best of our knowledge, the first of this nature in terms of complexity -- it is just a simple combination of three ingredients, two of which are well-known and widely used, namely digital chirp signals and discrete Gaussian filter functions, and the third being recent developments in Sparse Fast Fourier Transform algorithms.

### Ruhui Jin (UW Madison)

**Scalable symmetric Tucker tensor decomposition**

In this talk, we will study the best low-rank Tucker decomposition of symmetric tensors. The main motivation is in decomposing higher-order multivariate moments, which has crucial applications in data science.

We will show the scalable adaptations of the projected gradient descent (PGD) and the higher-order eigenvalue decomposition (HOEVD) methods to decompose sample moment tensors. With the help of implicit and streaming techniques, we can evade the overhead cost of building and storing the moment tensor. Such reductions make computing the Tucker decomposition realizable for large data instances in high dimensions.

We will also demonstrate the efficiency of the algorithms and the applicability to real-world datasets. For convergence guarantee, the update sequence derived by the PGD solver achieves first and second-order criticality.

### Max Hill (UW Madison)

**An algebraic approach to understanding long branch attraction in phylogenetics**

Reconstructing and understanding the Tree of Life is a central question biology, and while genetic data has proven to be an essential tool in studying the evolutionary relationships between living organisms, the "best" way to use such data to evolutionary trees is a major area of study. One commonly-used approach is *maximum likelihood*: one first assumes that the gene sequences evolved according to some model of evolution parameterized by an evolutionary tree parameter (representing a hypothesis about the species' evolutionary histories). Then, given some observed data, one can estimate the tree which maximizes the likelihood of the data under the model. However, sometimes this method fails catastrophically in ways that are not yet fully understood.

In this talk, I will present ongoing work with Jose Rodriguez investigating *Long Branch Attraction* (LBA), a phenomenon in which maximum likelihood incorrectly infers distantly related taxa to be much more closely related than they really are. We present an approach using phylogenetic invariants and homotopy continuation to numerically solve for the MLE in the simplest case of a four leaf tree, and use this to establish criteria for LBA susceptibility using Likelihood Ratios.

### Fred Sala (UW Madison, CS)

**Moment Techniques for Weakly-Supervised Learning**

Manually labeling data is a major bottleneck for building large-scale training sets used for modern machine learning models. Weak supervision methods instead assemble a collection of cheap-but-noisy sources to estimate labels. These sources have unknown qualities and interactions, and therefore learning latent variable models to estimate their reliabilities is a useful goal. I will start by describing techniques that rely on algebraic relationships between moments to learn such models, focusing on identifiability, consistency, and finite-sample bounds. In the second part of the talk, I will discuss how to generalize such techniques to structured prediction cases where the label is an element of a structured space (for example, a ranking or a graph). I will introduce an approach to estimation that combines pseudo-Euclidean embeddings and tensor decomposition algorithms. Finally, I will describe some opportunities and outstanding challenges.

### Marina Garrote López (UBC)

**Using Algebraic and Semi-Algebraic conditions for Phylogenetic Reconstruction**

Phylogenetic reconstruction aims to estimate the evolutionary relationships among different biological entities using only information from their genomes. Typically, DNA sequences are assumed to evolve according to a Markov process on a phylogenetic tree governed by a model of nucleotide substitutions. The distribution of nucleotide patterns in the leaves of a phylogenetic tree is represented by a vector whose entries satisfy certain algebraic relationships and can be expressed as polynomials over the model parameters. The study of these relationships and of the geometry of the algebraic varieties defined by them has provided successful insight into the problem of phylogenetic reconstruction.

However, not all points in the algebraic varieties have biological or probabilistic sense, and it is essential to study the subset of varieties that has biological meaning. In this talk we will emphasize the importance of studying this subsets, characterize them and prove that, in some cases, it is fundamental for phylogenetic reconstruction. We will discuss the use of maximum likelihood estimation, Euclidean distance minimization over the corresponding subset of the variety, and a new phylogenetic reconstruction method called ASAQ, based on some semi-algebraic conditions.

We will also present some results on the maximum likelihood degree and the Euclidean distance degree for phylogenetic models, which measure the complexity of optimization problems, and discuss their relationship and differences. Ultimately, this talk aims to provide insights into the problem of phylogenetic reconstruction and to show how algebraic and semi-algebraic methods can help in the development of more accurate and effective methods for this task.

### Serkan Hosten (SFSU)

**Solving Polynomial Systems for Optimization and Neural Networks**

This talk has two parts. In the first part, I will tell about computing the degree of the central curve in linear, quadratic, and semidefinite programming problems arising in the context of interior point algorithms. I will present a new proof for this degree in the linear programming case based on counting the number of solutions to zero dimensional polynomial systems using the BKK bound. In the case of quadratic programming, a very similar system of polynomials gives an upper bound to the degree of the corresponding central curve.

In the second part, I will tell you about training deep linear neural networks and how solving systems of polynomials arise in this context. For networks with a single hidden layer trained on a single data point I will give an improved bound on the number of complex critical points of the loss function for the network. I will also show that for any number of hidden layers complex critical points with zero coordinates arise in certain patterns which we completely classify for networks with one hidden layer. I report the results of computational experiments with varying network architectures defining small deep linear networks using HomotopyContinuation.jl.

### Elizabeth W. (Eliza) O'Reilly (Caltech)

**Spectrahedral Regression**

Convex regression is the problem of fitting a convex function to a collection of input-output pairs, and arises naturally in applications such as economics, engineering and computed tomography. We present a new approach to this problem called spectrahedral regression, in which we fit a spectrahedral function to the data, i.e. a function that is the maximum eigenvalue of an affine matrix expression of the input. This method generalizes polyhedral (also called max-affine) regression, in which a maximum of a fixed number of affine functions is fit to the data. We first provide bounds on how well spectrahedral functions can approximate arbitrary convex functions via statistical risk analysis. Second, we analyze an alternating minimization algorithm for the non-convex optimization problem of fitting a spectrahedral function to data. Finally, we demonstrate the utility of our approach with experiments on synthetic and real data sets. This talk is based on joint work with Venkat Chandrasekaran.

### Alberto Del Pia (UW Madison, I&SE)

**Minimizing quadratics over integers**

Mixed integer quadratic programming is the problem of minimizing a quadratic polynomial over points in a polyhedral region with some integer components. It is a natural extension of mixed integer linear programming and it has a wide array of applications. In this talk, I will survey some recent theoretical developments in mixed integer quadratic programming, with a focus on complexity, algorithms, and fundamental properties.

## Related events to note

date | event/title | location/speaker | info |
---|---|---|---|

Friday at 2:30pm | Algebra+Algebraic Geometry Seminar | Van Vleck | Mailing list |

Sporadically | Machine Assistance for Algebra | UW Madison | |

Biweekly Mondays | Algebraic Statistics Online Seminar (ASOS) | Virtual: Recordings | Mailing list sign-up for Zoom-links |

Monthly | Fellowship of the Ring | Online (Utah) |