Skip to main content

Chapter 20 Prime \(k\)-tuples

This chapter begins the fifth part of the course, in which we combine material from previous parts to study gaps between primes.
In this chapter, we review the work of Gallagher [6] in the direction of the Hardy–Littlewood \(k\)-tuples conjecture, a vast (and quantitative) generalization of the twin primes conjecture.

Section 20.1 The Hardy–Littlewood \(k\)-tuples conjecture

Let \(\calH\) denote a \(k\)-tuple of distinct integers. What does one expect about the distribution of the integers \(n\) such that \(n + h\) is prime for each \(h \in \calH\text{?}\)
Here is a rather simple-minded guess. The prime number theorem suggests that if one chooses a random integer of size \(x\text{,}\) it will be prime with probability \(1/(\log x)\text{.}\) If one then chooses \(k\) distinct integers of size \(x\text{,}\) and there is no obvious reason why they cannot all be prime, then one might expect them to be simultaneously prime with probability \(\log^{-k} x\text{,}\) and the number of such tuples with terms bounded by \(x\) should be asymptotic to \(x \log^{-k} x\text{,}\) with the constant 1.
However, this turns out not to be the correct constant, as is easily verified against experimental evidence in the case of twin primes. The reason is perhaps obvious: the facts that the different \(n+h\) are coprime to a fixed prime \(p\) are not independent, and one needs to account for this. Here is the recipe for doing so proposed by Hardy–Littlewood.

Definition 20.1.

Fix a prime \(p\text{.}\) The probability that \(k\) randomly chosen integers are all not divisible by \(p\) is \((1-1/p)^k\text{.}\) On the other hand, the probability that the \(n+h\) are all coprime to \(p\) is \(1 - v_{\calH}(p)/p\text{,}\) where \(v_{\calH}(p)\) is the number of residue classes modulo \(p\) represented by elements of \(\calH\text{.}\) We thus set
\begin{equation*} \frakS(\calH) := \prod_p \left(1 - \frac{v_{\calH}(p)}{p} \right) \left( 1 - \frac{1}{p} \right)^{-k} \end{equation*}
(called the singular series, because it occurs in the Hardy–Littlewood circle method as a series summed over singularities of some integral) and make the following conjecture of Hardy–Littlewood [11]. (The infinitude of the set of primes in question had been previously conjectured by Dickson [4].)

Remark 20.3.

If \(v_{\calH}(p) = 0\) for some \(p\text{,}\) then there is a trivial obstruction created by divisibility mod \(p\text{,}\) so you only get finitely many prime \(k\)-tuples of that shape. On the other hand, if \(v_{\calH}(p) \lt p\) for all \(p\text{,}\) then the product converges absolutely and so \(\frakS(\calH) > 0\text{;}\) so there is no inconsistency here.

Remark 20.4.

It will be convenient later to adopt the convention that the definition of \(\frakS(\calH)\) remains unchanged if \(\calH\) does not have distinct entries, even though this carries no new information.

Section 20.2 \(k\)-tuples and prime gaps

If one is only interested in looking for primes which are close together, without specifying exactly what the gaps are, one could go back to the probabilistic model (attributed to Cramér, more famous for his rule for solving linear systems). It suggests that the distribution of \(\pi(n+h)-\pi(n)\text{,}\) for \(n \leq N\) and \(h \sim \lambda \log N\) with \(\lambda\) fixed, should approach a Poisson distribution with parameter \(\lambda\) as \(N \to \infty\text{.}\) (This is consistent with the observation that \(\li(x)\) is a better approximation of to \(\pi(x)\) than \(\frac{x}{\log x}\text{;}\) see Exercise 8.4.1.)
The fact that this prediction follows from a suitably uniform version of the \(k\)-tuples conjecture is due to Gallagher. The theorem asserts that the fudge factor \(\frakS(\calH)\) between the probabilistic model and the Hardy–Littlewood prediction averages out to 1, so that the prediction based on the probabilistic model is consistent with Hardy–Littlewood. (Per Remark 20.4 we allow tuples with repeated entries, but their overall contribution is \(O(x^{k-1})\) and so does not affect anything.)
We sketch of Gallagher's proof modulo a few details left as exercises. Throughout, keep \(k\) fixed.
Put
\begin{align*} a(p, m) \amp= \left(1 - \frac{m}{p} \right) \left( 1 - \frac{1}{p} \right)^{-k} - 1\\ a_{\calH}(p) \amp= a(p, v_{\calH}(p)) \end{align*}
so that
\begin{equation*} \frakS(\calH) = \prod_p (1 + a_{\calH}(p)). \end{equation*}
Extend \(a\) by multiplicativity to squarefree arguments \(d\text{,}\) so that
\begin{equation*} \frakS(\calH) = \sum_d a_{\calH}(d) \end{equation*}
with the sum on the right being absolutely convergent.
We can truncate the sum over \(d\) because for fixed \(\epsilon > 0\text{,}\) we have (by Exercise 20.3.3) that
\begin{equation} \sum_{\calH \in \{1, \dots,x\}^k} \frakS(\calH) = \sum_{d \leq y} \sum_{\calH} a_\calH(d) + O(x^k (xy)^\epsilon/y)\tag{20.2.1} \end{equation}
with the constant depending only on \(k,\epsilon\) and not on \(x,y\text{.}\)
For any given \(d\text{,}\) we can rewrite the inner sum of (20.2.1) as a sum
\begin{equation*} \sum_v \left(\prod_{p|d} a(p,v(p)) \right)f_d(x,v), \end{equation*}
where \(v\) runs over vectors indexed by the prime factors of \(d\text{,}\) with \(v(p) \in \{1, \dots, p\}\) for each \(p | d\text{,}\) and \(f_d(x,v)\) counts \(k\)-tuples \(\calH \in \{1,\dots,x\}^r\) which occupy exactly \(v(p)\) residue classes modulo \(p\) for each \(p | d\text{.}\)
Write \(\stir{a}{b}\) for the number of partitions of an \(a\)-element set into \(b\) unordered parts (Stirling number of the second kind). If we set
\begin{align*} A(d) \amp= \sum_v \prod_{p|d} a(p,v(p)) \binom{p}{v(p)} v(p)! \stir{k}{v(p)}\\ B(d) \amp= \sum_v \prod_{p|d} |a(p,v(p))| \binom{p}{v(p)} v(p)! \stir{k}{v(p)}\\ C(d) \amp= \sum_v \prod_{p|d} |a(p,v(p))|, \end{align*}
then
\begin{equation} \sum_{\calH} a_\calH(d) = (x/d)^k A(d) + O((x/d)^{k-1} B(d)) + O(x^{k-1} C(d)).\tag{20.2.2} \end{equation}
From this, plus the identities
\begin{align} \sum_{v=1}^p \binom{p}{v} v! \stir{k}{v} \amp= p^k\tag{20.2.3}\\ \sum_{v=1}^p v \binom{p}{v} v! \stir{k}{v} \amp= p^{k+1} - (p-1)^k p\tag{20.2.4} \end{align}
(see Exercise 20.3.5), it is not difficult to finish (see Exercise 20.3.6).

Exercises 20.3 Exercises

1.

Let \(p_1,p_2,\dots\) be the sequence of primes in ascending order. Prove that for any positive integer \(k\text{,}\) for \(n = \pi(k)\text{,}\) the tuple \(\calH = (p_{n+1}, \dots, p_{n+k})\) satisfies \(\frakS(\calH) > 0\text{.}\)

2.

Prove that for \(k\) a positive integer,
\begin{equation*} \int_2^x \log^{-k} t\,dt \sim x \log^{-k} x. \end{equation*}

4.

Prove (20.2.2).
Hint.
It might help to think in terms of counting lattice points.

6.

Complete the proof of Theorem 20.5.
Hint.
First use (20.2.3) and (20.2.4) to calculate \(A(d)\text{.}\) Then estimate \(B(d)\) and \(C(d)\) using the bound
\begin{equation*} |a(p,m)| \leq \begin{cases} c(k) (p-1)^{-2} \amp m=k \\ c(k) (p-1)^{-1} \amp m \lt k; \end{cases} \end{equation*}
note that the constant \(c(k)\) depends on \(k\) but not on \(p\) or \(m\text{.}\) Finally, take \(y = x^{1/2}\) in (20.2.1).

7.

Use the Poisson distribution model to compute a predicted distribution for the ratio \((p_{n+1}-p_n)/(\log p_n)\text{.}\)