## Szemerèdi Regularity Lemma

Welcome to the first in our series about Great Lemmas of Mathematics. We’ll start with one of the greats from combinatorics, perhaps the most useful tool in combinatorics and graph theory: the Szemerèdi Regularity Lemma (SRL). This was first proved by Szemerèdi (unsurprisingly) in his 1975 paper “On sets of integers containing no $k$ elements in arithmetic progression” as part of his proof of what is now known as Szemerèdi’s theorem – a cornerstone of additive combinatorics and one which I’m sure we’ll come back to in a future post.

This lemma turned out to be very powerful and versatile, and new and exciting results which use SRL have appeared at regular intervals ever since. Gowers has compared the SRL to playing the same role in combinatorics as the classification of finite simple groups has in finite group theory, providing a rough classification of all graphs, and offering a strategy for proving results in (finite) graph theory: first show it’s true for all graphs which obey very nice structural properties, then use the SRL to show that any graph behaves like one of these.

To state it precisely, we need to set up some notation. First, recall that a graph is just a finite set (called vertices) together with a set of pairs of vertices (called edges). We suppose that our graphs have no loops (i.e. edges connecting a vertex with itself) and that the edges aren’t directed or have any special weights attached to them.

Suppose we have two sets of vertices $U$ and $V$. Define the density $d(U,V)$ between $U$ and $V$ to be the total number of edges between them divided by the total number of edges possible between them (which is $\lvert U\rvert\lvert V\rvert$).  For any parameter $\epsilon>0$ we say that $U$ and $V$ are $\epsilon$-regular if we have $\lvert d(U',V')-d(U,V)\rvert<\epsilon$

whenever $U'$ and $V'$ are subsets of $U$ and $V$ respectively with sizes at least $\epsilon\lvert U\rvert$ and $\epsilon\lvert V\rvert$.

Szemerèdi Regularity Lemma

For any graph $\epsilon>0$ there exists a $K(\epsilon)$ such that any graph $G$ with vertex set $V$ can be broken into $k\leq K$  chunks $V_1,\hdots, V_k$ such that the following hold:

1. $\lvert V_i\rvert\leq \epsilon \lvert V\rvert$,
2. All the $V_i$ differ in size by at most one element from one another, and
3. All except $\epsilon k^2$ of the pairs $(V_i,V_j)$ are $\epsilon$-regular.

What does this really say? How this is usually used is to first consider the graph you get by letting each vertex chunk $V_i$ be a single vertex and joining two vertices if their corresponding chunks are an $\epsilon$-regular pair. You can then use regularity to deduce that if the collapsed graph contains, for example, a certain subgraph (such as a triangle) then the original graph must as well. Edges inside the vertex chunks, as well as between non-regular pairs form only a tiny proportion of the total number of edges so they don’t really matter.

A picture, however, is worth a thousand words, even it is made with my rudimentary paint skills.

I haven’t actually said why being a regular pair is a good thing to be, however, or what this really means. The idea is that a regular pair behaves like a bipartite graph where the edges are assigned randomly with some fixed probability: in other words, if you zoom in on any two bits of either chunk then they look like what you started with. Local behaviour resembles global behaviour with $\epsilon$-precision.

To see why this is a good thing to have, let’s look at the example hinted at earlier: finding triangles inside our graph. Suppose we have a triangle of regular pairs, a triangle in the collapsed graph mentioned above. Also suppose that our graph has quite a few edges to begin with, so that the edge density between these regular pairs is fairly high. It follows that there must be lots of B-C edges which share a vertex with some A-B edge (if our chunks are labelled A, B and C). Now consider the set of A vertices which are on some A-B edge, and the set of C vertices which are on one of the B-C edges connected with an A-B edge. Both of these sets are large, and by regularity, there must be lots of edges going between these two sets. In particular, there must be some edge which joins up a pair of edges to form a triangle.

That makes a lot less sense than I intended it to do. Perhaps it’s better digested in the form of a more formal proof, which will be coming soon, when I describe one of the most well-known uses of the SRL in the Triangle Lemma.

A final remark: it is crucial that the parameter $K$, which controls how many chunks are in this regular decomposition, depends only on $\epsilon$ and not on the size of the graph $G$. Otherwise, we could decompose the graph $G$ into chunks which each only contain a single vertex, and these pairs are all regular since there is nothing local to zoom in on. The more chunks we have, the worse our arguments become, so we don’t want too many.

Here’s the kicker: the proof of the SRL gives an upper bound for $K$ which is of tower height. In other words, a tower of exponents $2^{2^{2^{\cdots}}}$ which is of height roughly $1/\epsilon$. Even for fairly large $\epsilon$, this is massive, and so any results which use the SRL have terrible quantitative bounds, such as the ones Szemerèdi’s theorem had before Gowers found an alternative proof.

It is often possible to replace the use of the SRL with different arguments, and desirable for precisely this reason. One might hope that maybe we can improve the SRL directly, however, and get good (say, exponential) bounds instead. Gowers has shown that this is impossible, and constructed a graph that needs such a ridiculous number of chunks in any regular decomposition. Again, I hope to give this construction in a future blog post, since it really is amazing.

Enough for now. Hopefully these vague comments will make more sense when we actually start talking about such things formally.

This entry was posted in Uncategorized. Bookmark the permalink.

### 2 Responses to Szemerèdi Regularity Lemma

1. johnpappas says:

Reblogged this on John Pappas's blog and commented:
Regarding triangle removal lemma, I recommend for further reading:
1) The paper “Quasirandomness, Counting and Regularity for 3-Uniform Hypergraphs” by W. T. Gowers (it is interesting to see how this lemma can be generalized to hypergraphs) and
2) These excellent online notes by David Conlon https://www.dpmms.cam.ac.uk/~dc340/Extremal-course.html .

2. Rod Carvalho says:

Typo: Szemerèdi – > Szemerédi