Anticoncentration of Random Subset Sums
I took 18.204 (Seminar in Discrete Math) this semester; this is a class where we each choose a topic and give a few presentations on it. My topic was anticoncentration. In this post, I’ll talk about a theorem that’s in some sense the starting point for all the things I presented on. (I may post about those things later.)
Introduction
In this post, we’ll consider the following question: given nonzero reals $a_1, \ldots, a_n$ and independent uniformly random $\varepsilon_1, \ldots, \varepsilon_n \in \{0, 1\}$, how concentrated can the random variable $\varepsilon_1a_1 + \cdots + \varepsilon_na_n$ be?
In particular, we’re interested in proving upper bounds on how concentrated $\varepsilon_1a_1 + \cdots + \varepsilon_na_n$ can be at a single point $a$ — this means we want to upper-bound $\max_a \mathbb{P}[\varepsilon_1a_1 + \cdots + \varepsilon_na_n = a]$. This is why the topic is referred to as anticoncentration — we’re trying to show that a random variable is not super concentrated at any point.
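To make this quantity concrete, here’s a brute-force sketch (the function name is mine, not from the seminar) that computes $\max_a \mathbb{P}[\varepsilon_1a_1 + \cdots + \varepsilon_na_n = a]$ exactly for uniform $\varepsilon_i \in \{0, 1\}$ — it enumerates all $2^n$ outcomes, so it’s only usable for tiny $n$:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

def max_concentration(a):
    """Brute-force max over targets of P[eps_1*a_1 + ... + eps_n*a_n = target],
    with each eps_i uniform in {0, 1} (2^n outcomes, so only for small n)."""
    counts = Counter(sum(e * x for e, x in zip(eps, a))
                     for eps in product((0, 1), repeat=len(a)))
    return Fraction(max(counts.values()), 2 ** len(a))

print(max_concentration([1, 2, 4, 8]))  # 1/16 -- all 16 subset sums are distinct
print(max_concentration([1, 1, 1, 1]))  # 3/8 -- the sum 2 is hit C(4,2) = 6 times
```

Spread-out $a_i$’s (like powers of $2$) give the smallest possible concentration, while equal $a_i$’s pile many sign choices onto one value.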
This question was first considered by Littlewood and Offord (who were studying real roots of random polynomials) in a paper from 1943; they proved the following upper bound.

Theorem 1 (Littlewood–Offord). For any nonzero reals $a_1, \ldots, a_n$, we have \[\max_a \mathbb{P}[\varepsilon_1a_1 + \cdots + \varepsilon_na_n = a] = O\left(\frac{\log n}{\sqrt{n}}\right).\]
This was later improved by Erdős to the following bound.

Theorem 2 (Erdős). For any nonzero reals $a_1, \ldots, a_n$, we have \[\max_a \mathbb{P}[\varepsilon_1a_1 + \cdots + \varepsilon_na_n = a] \leq \frac{1}{2^n}\binom{n}{\lfloor n/2\rfloor}.\]
As some intuition regarding how big the right-hand side is, by Stirling’s approximation $\binom{n}{\lfloor n/2\rfloor}$ is roughly $\frac{2^n}{\sqrt{n}}$ (ignoring constant factors), so this essentially removes the $\log n$ from the bound of Littlewood–Offord.
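A quick numerical check of this approximation (Stirling actually pins down the constant: $\binom{n}{\lfloor n/2\rfloor} \sim \sqrt{2/\pi} \cdot 2^n/\sqrt{n}$):

```python
import math

# The ratio C(n, n//2) * sqrt(n) / 2^n should approach sqrt(2/pi) ≈ 0.798,
# confirming C(n, n//2) is within a constant factor of 2^n / sqrt(n).
for n in (10, 100, 1000):
    print(n, math.comb(n, n // 2) * math.sqrt(n) / 2 ** n)
```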
Also, a nice thing is that this theorem is tight: taking all the $a_i$’s equal (say $a_i = 1$ for every $i$) and $a = \lfloor n/2\rfloor$ achieves equality, since $\varepsilon_1 + \cdots + \varepsilon_n = \lfloor n/2\rfloor$ for exactly $\binom{n}{\lfloor n/2\rfloor}$ of the $2^n$ outcomes.
So the bound in Theorem 2 is the best bound we could possibly hope to get, and it turns out to be true — this essentially means that $\varepsilon_1a_1 + \cdots + \varepsilon_na_n$ is the most concentrated in the case where all the $a_i$’s are equal.
Erdős proved this theorem using a very neat combinatorial argument, which I’ll explain in the rest of this post.
Sperner’s theorem
The proof is going to involve Sperner’s theorem regarding antichains of subsets of $[n]$, so we’ll first state and prove this theorem.
An antichain is a collection of subsets of $[n]$ in which no subset contains another. For example, $\{1, 2\}$, $\{2, 4\}$, $\{3, 4\}$ form an antichain. On the other hand, $\{2, 4\}$, $\{1, 3, 4\}$, $\{2, 3, 4\}$ don’t form an antichain, as the third contains the first.
Sperner’s theorem considers the following question: how large can an antichain of subsets of $[n]$ be?
(By the size of an antichain $\mathcal{S}$, we mean the number of subsets in $\mathcal{S}$.)
One potential construction of an antichain is to take all subsets of a fixed size $k$.
To maximize the size of this antichain, we should take $k = \lfloor n/2\rfloor$, which gives an antichain of size $\binom{n}{\lfloor n/2\rfloor}$. And Sperner’s theorem says that this is the best we can do: every antichain of subsets of $[n]$ has size at most $\binom{n}{\lfloor n/2\rfloor}$.
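As a tiny sanity check that the middle layer is the largest of the fixed-size layers (for odd $n$, the two middle values tie):

```python
import math

n = 9
layer_sizes = [math.comb(n, k) for k in range(n + 1)]
print(layer_sizes)  # [1, 9, 36, 84, 126, 126, 84, 36, 9, 1]
# The binomial coefficients peak at the middle, so k = n//2 maximizes the layer.
assert max(layer_sizes) == math.comb(n, n // 2)
```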
There are several proofs of this theorem; here’s a very cute one that I saw in 18.226 (involving a clever application of probabilistic methods).
Consider building a random chain $\emptyset = I_0 \subseteq I_1 \subseteq \cdots \subseteq I_n = [n]$ by adding the elements of $[n]$ one at a time in a uniformly random order, so that $I_k$ consists of the first $k$ elements added. On one hand, no matter what order we choose elements in, at most one set in $\mathcal{S}$ can appear in this sequence — if two distinct sets $A$ and $B$ appeared in the sequence, then we'd have $A \subseteq B$ or $B \subseteq A$, contradicting the fact that $\mathcal{S}$ is supposed to be an antichain.
On the other hand, what's the expected number of sets $A \in \mathcal{S}$ to appear in this sequence? Each $I_k$ is a uniform random subset of $[n]$ with size $k$ (by symmetry), so if $\lvert A\rvert = k$, then the probability that $A$ appears in our sequence is \[\mathbb{P}[\text{$A$ appears in the sequence}] = \mathbb{P}[I_k = A] = \frac{1}{\binom{n}{k}}\] (since there are $\binom{n}{k}$ equally likely possibilities for $I_k$, one of which is $A$). So by linearity of expectation, the expected number of sets $A \in \mathcal{S}$ to appear in the sequence is \[\mathbb{E} \#\{A \in \mathcal{S} \mid \text{$A$ appears in the sequence}\} = \sum_{A \in \mathcal{S}} \frac{1}{\binom{n}{\lvert A\rvert}} \leq \frac{\lvert \mathcal{S}\rvert}{\binom{n}{\lfloor n/2\rfloor}}\] (using that $\binom{n}{k} \leq \binom{n}{\lfloor n/2\rfloor}$ for every $k$). And since the left-hand side is at most $1$ (the quantity we're taking the expectation of is always at most $1$), this means $\lvert \mathcal{S}\rvert \leq \binom{n}{\lfloor n/2\rfloor}$, as desired.
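For a tiny case, we can also confirm Sperner's bound by sheer enumeration. This sketch (my own, with subsets of $[4]$ encoded as bitmasks) checks all $2^{16}$ families of subsets and finds that the largest antichain indeed has size $\binom{4}{2} = 6$:

```python
from itertools import combinations
from math import comb

n = 4
masks = list(range(1 << n))  # subsets of [4] as 4-bit masks

def is_antichain(family):
    # A is a subset of B exactly when A & B == A (and vice versa for B)
    return not any(A & B in (A, B) for A, B in combinations(family, 2))

# Exhaustively check every family of subsets of [4]; only feasible for tiny n.
best = 0
for fam_bits in range(1 << len(masks)):
    family = [m for m in masks if fam_bits >> m & 1]
    if is_antichain(family):
        best = max(best, len(family))

assert best == comb(n, n // 2)  # the six 2-element subsets achieve this
print(best)
```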
Proof of the main theorem
Now we’re ready to prove Theorem 2.
First, we can assume that the $a_i$’s are all positive, as flipping the sign of some $a_i$ doesn’t affect the value of $\max_a \mathbb{P}[\varepsilon_1a_1 + \cdots + \varepsilon_na_n = a]$. One way to see this is that we could imagine choosing all the $\varepsilon_i$’s from \(\{-1, 1\}\) instead of \(\{0, 1\}\) — this wouldn’t affect $\max_a \mathbb{P}[\varepsilon_1a_1 + \cdots + \varepsilon_na_n = a]$, since substituting $\varepsilon_i \mapsto 2\varepsilon_i - 1$ corresponds to performing the invertible affine transformation $a \mapsto 2a - (a_1 + \cdots + a_n)$ to our random variable $\varepsilon_1a_1 + \cdots + \varepsilon_na_n$. And if we’re choosing \(\varepsilon_i \in \{-1, 1\}\), then clearly flipping the sign of $a_i$ has no effect on the distribution of this random variable.
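Both reductions in this paragraph are easy to verify by brute force. This sketch (variable names are mine) checks the affine relation between the $\{0,1\}$ and $\{-1,1\}$ models, and that flipping the sign of one $a_i$ merely shifts the distribution of the subset sum:

```python
from itertools import product
from collections import Counter

a = [3, 5, 2, 1]

# Substituting eps'_i = 2*eps_i - 1 in {-1, 1} turns the {0,1}-sum S into
# 2S - (a_1 + ... + a_n), an invertible affine map of the random variable.
for eps in product((0, 1), repeat=len(a)):
    s01 = sum(e * x for e, x in zip(eps, a))
    spm = sum((2 * e - 1) * x for e, x in zip(eps, a))
    assert spm == 2 * s01 - sum(a)

def sum_counts(v):
    return Counter(sum(e * x for e, x in zip(eps, v))
                   for eps in product((0, 1), repeat=len(v)))

# Flipping the sign of a_2 = 5 shifts the whole distribution down by 5
# (via the bijection eps_2 -> 1 - eps_2), so concentration is unchanged.
c_orig, c_flip = sum_counts(a), sum_counts([3, -5, 2, 1])
assert Counter({s - 5: m for s, m in c_orig.items()}) == c_flip
```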
Now fix $a$, and define a collection $\mathcal{S}$ consisting of all the subsets of $[n]$ that correspond to subset sums of exactly $a$ — i.e., for each $(\varepsilon_1, \ldots, \varepsilon_n)$ satisfying $\varepsilon_1a_1 + \cdots + \varepsilon_na_n = a$, we place the corresponding set \(A = \{i \in [n] \mid \varepsilon_i = 1\}\) into $\mathcal{S}$.
The key observation (and where Sperner’s theorem comes in) is the following.
The claim is that $\mathcal{S}$ is an antichain. Indeed, suppose we had two sets $A \subsetneq B$ in $\mathcal{S}$. Since the $a_i$'s are all positive, we'd then have $\sum_{i \in B} a_i > \sum_{i \in A} a_i$. But these sums are both supposed to be $a$ (by the definition of $\mathcal{S}$), so this is a contradiction.
So then Sperner’s theorem implies that $\lvert \mathcal{S}\rvert \leq \binom{n}{\lfloor n/2\rfloor}$, which proves Theorem 2 (as there are $2^n$ equally likely outcomes for $(\varepsilon_1, \ldots, \varepsilon_n)$, and the ‘good’ ones — those summing to $a$ — precisely correspond to elements of $\mathcal{S}$).
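To close the loop, here’s a brute-force spot-check of Theorem 2 (a sketch of mine, not from the seminar) on random positive integer $a_i$’s, including the equality case from earlier:

```python
from itertools import product
from collections import Counter
from math import comb
import random

random.seed(0)
n = 8
bound = comb(n, n // 2)  # = 70

# Theorem 2: no single target value is hit by more than C(n, n//2)
# of the 2^n choices of (eps_1, ..., eps_n).
for _ in range(50):
    a = [random.randint(1, 30) for _ in range(n)]
    counts = Counter(sum(e * x for e, x in zip(eps, a))
                     for eps in product((0, 1), repeat=n))
    assert max(counts.values()) <= bound

# Equality when all a_i are equal: the value n//2 is hit exactly C(n, n//2) times.
equal_counts = Counter(sum(eps) for eps in product((0, 1), repeat=n))
assert equal_counts[n // 2] == bound
```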