
Chapter 16 An additive large sieve inequality

This chapter begins the fourth part of the course, in which we develop large sieve inequalities. While these give various interesting results, our primary interest here will be in controlling error terms in the prime number theorem in aggregate via the Bombieri–Vinogradov inequality.
In this chapter, we introduce an additive large sieve inequality, which can be conceptualized outside of the context of number theory. We will convert this into a multiplicative form, which is more relevant to the distribution of primes, in Chapter 17.

Section 16.1 Overview

The purpose of a “large sieve” is to allow sieving over a range of primes not possible with the traditional sieve methods we considered earlier. The price to be paid is that one only gets results of an aggregate nature. For instance, in the Bombieri–Vinogradov theorem, we will consider the error terms in the prime number theorem in arithmetic progressions for all moduli in some range, and show that the sum of the errors cannot be too large.
This said, a “large sieve inequality” does not itself involve a sieve; the sieves only appear in the application. The general large sieve problem is, given a finite set \(V\) of vectors \(v \in \CC^n\text{,}\) to find the smallest constant \(C = C(V)\) such that for any vector \(x \in \CC^n\text{,}\)
\begin{equation} \sum_{v \in V} |v \cdot \overline{x}|^2 \leq C x \cdot \overline{x}.\tag{16.1.1} \end{equation}
(Note that \(v \cdot \overline{x}\) is the usual Hermitian inner product.) Of course one has \(C \leq \sum_{v \in V} v \cdot \overline{v}\) by Cauchy–Schwarz term by term, but this is nowhere near optimal if the vectors \(v\) are pointing in all different directions, as then the vector \(x\) cannot simultaneously be nearly parallel to all of them. A trivial example is given by an orthonormal set of vectors, in which case \(C = 1\text{;}\) see Exercise 16.4.1 for another simple example.
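To make the definition concrete (an illustrative sketch, not part of the text): since \(\sum_{v \in V} |v \cdot \overline{x}|^2 = x^* A x\) for the Hermitian matrix \(A = \sum_{v \in V} v v^*\text{,}\) the optimal constant \(C(V)\) is the largest eigenvalue of \(A\text{.}\) The helper function below is hypothetical.

```python
import numpy as np

def optimal_large_sieve_constant(V):
    """Smallest C with sum_v |v . conj(x)|^2 <= C x . conj(x) for all x:
    the largest eigenvalue of the Hermitian matrix A = sum_v v v^*."""
    A = sum(np.outer(v, np.conj(v)) for v in V)
    return float(max(np.linalg.eigvalsh(A)))

e = np.eye(3)
print(optimal_large_sieve_constant([e[0], e[1], e[2]]))  # orthonormal set: 1.0
print(optimal_large_sieve_constant([e[0], e[0]]))        # parallel vectors: 2.0
```

The first example recovers \(C = 1\) for an orthonormal set, as noted above; the second shows that parallel vectors force \(C\) up to the trivial bound.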
In number theory applications, we tend to view the same setup as follows. Given a finite set \(X\) of complex-valued sequences, and a cutoff \(N\text{,}\) find a constant \(C = C(X,N)\) such that for any \(a_n \in \CC\text{,}\)
\begin{equation*} \sum_{x \in X} \left| \sum_{n \leq N} a_n x(n) \right|^2 \leq C \sum_{n \leq N} |a_n|^2. \end{equation*}

Section 16.2 Preparatory lemmas

In preparation for stating a large sieve result, we collect some elementary statements from linear algebra, whose proofs we mostly leave as exercises.
The following is a variation of a classic inequality of Hilbert.
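The statement of Lemma 16.1 does not appear in this excerpt; for orientation, the classic inequality of Hilbert referred to here is presumably the following bound (with the sharp constant \(\pi\) due to Schur).

```latex
% Hilbert's inequality: for complex numbers a_1, \dots, a_n,
\left| \sum_{m \neq n} \frac{a_m \overline{a_n}}{m - n} \right|
  \leq \pi \sum_{n} |a_n|^2.
```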
Let \(K\) be a large positive integer. By Lemma 16.1 applied to the points \(M + \alpha_i\) and the values \((-1)^M z_i\) for \(i=1,\dots,n\) and \(M=1,\dots,K\text{,}\) we get
\begin{equation*} \left| \sum_{(i,M) \neq (j,N)} (-1)^{M-N} \frac{z_i \overline{z_j}}{M-N + \alpha_i - \alpha_j}\right| \leq \frac{\pi K}{\delta} \sum_{i=1}^n |z_i|^2. \end{equation*}
It changes nothing to restrict the sum to pairs of pairs with \(i \neq j\text{,}\) since for \(i = j\) the terms indexed by \((i,M),(i,N)\) and \((i,N),(i,M)\) cancel each other. Put \(k = M-N\) and divide by \(K\) to obtain
\begin{equation*} \left| \sum_{i\neq j} z_i \overline{z_j} \sum_{k=-K}^K \left(1 - \frac{|k|}{K} \right) \frac{(-1)^k}{k + \alpha_i - \alpha_j} \right| \leq \frac{\pi}{\delta} \sum_{i=1}^n |z_i|^2. \end{equation*}
Taking \(K \to \infty\) and recalling that
\begin{equation*} \frac{1}{\alpha} + \sum_{k=1}^\infty \left( \frac{(-1)^k}{k + \alpha} + \frac{(-1)^{-k}}{-k + \alpha} \right) = \frac{\pi}{\sin \pi \alpha} \end{equation*}
yields the claim.
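The identity used here is the standard partial-fraction expansion of \(\pi/\sin \pi \alpha\text{.}\) As an illustrative numerical sanity check (not part of the text), one can pair the \(+k\) and \(-k\) terms, so that the tail is an alternating series with terms of size \(O(k^{-2})\text{:}\)

```python
import math

def pi_over_sin(alpha, terms=10000):
    """Partial sum of 1/alpha + sum_{k>=1} (-1)^k (1/(k+alpha) + 1/(alpha-k)),
    which should converge to pi / sin(pi * alpha)."""
    s = 1.0 / alpha
    for k in range(1, terms + 1):
        s += (-1) ** k * (1.0 / (k + alpha) + 1.0 / (alpha - k))
    return s

alpha = 0.3
print(abs(pi_over_sin(alpha) - math.pi / math.sin(math.pi * alpha)) < 1e-6)  # True
```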
Apply Corollary 16.2 twice, multiplying \(z_i\) by \(\exp(\pm 2 \pi i x \alpha_i)\text{.}\)

Section 16.3 An additive large sieve

In the additive large sieve, we take the sequences \(x \in X\) to be of the form \(\exp(2\pi i \alpha n)\) for some \(\alpha = \alpha_x \in \RR\) (or better, in \(\RR/\ZZ\)). In order for these to be “not too parallel”, we insist that the corresponding \(\alpha_x\) be \(\delta\)-spaced for some \(\delta > 0\text{,}\) i.e., if \(x,y \in X\) are distinct, then \(\alpha_x - \alpha_y\) must have distance at least \(\delta\) from the nearest integer.
The following inequality is due independently to Selberg and Montgomery–Vaughan; it refines a result of Davenport–Halberstam.
We prove here only the bound with the factor \(\delta^{-1} + N - 1\) replaced by \(\delta^{-1} + N\text{;}\) there is a fun trick to pick up the extra \(-1\) (see Exercise 16.4.4).
By Lemma 16.4, we may reduce to showing that for any \(z_\alpha \in \CC\text{,}\)
\begin{equation*} \sum_{M \lt n \leq M+N} \left| \sum_{\alpha \in S} z_\alpha \exp(2 \pi i n \alpha) \right|^2 \leq (\delta^{-1} + N) \sum_{\alpha \in S} |z_\alpha|^2. \end{equation*}
When we expand the square on the left side, the diagonal terms contribute \(N \sum_\alpha |z_\alpha|^2\text{.}\) Summing the geometric series in \(n\text{,}\) the off-diagonal terms give
\begin{equation*} \sum_{\alpha \neq \beta} z_\alpha \overline{z_\beta} \exp(2 \pi i K (\alpha - \beta)) \frac{\sin \pi N (\alpha - \beta)}{\sin \pi (\alpha - \beta)} \end{equation*}
for \(K = M + (N+1)/2\text{.}\) By Corollary 16.3, this is bounded in absolute value by \(\delta^{-1} \sum_\alpha |z_\alpha|^2\text{.}\)
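As an illustrative numerical check (not part of the text), the displayed inequality from the reduction can be tested directly for equally spaced frequencies, which are \(1/R\)-spaced mod 1; all parameters below are arbitrary choices.

```python
import numpy as np

# Check: sum_{M < n <= M+N} |sum_alpha z_alpha e(n alpha)|^2
#          <= (1/delta + N) * sum_alpha |z_alpha|^2,
# where e(t) = exp(2 pi i t) and the alpha are delta-spaced mod 1.
rng = np.random.default_rng(1)
M, N, R = 10, 40, 7
alphas = np.arange(R) / R               # 1/R-spaced mod 1, so delta = 1/R
z = rng.normal(size=R) + 1j * rng.normal(size=R)

ns = np.arange(M + 1, M + N + 1)
lhs = sum(abs(np.sum(z * np.exp(2j * np.pi * n * alphas))) ** 2 for n in ns)
rhs = (R + N) * np.sum(np.abs(z) ** 2)  # R = 1/delta here
print(bool(lhs <= rhs))  # True
```

For these frequencies the check also follows from discrete Parseval, since the inner sums are periodic in \(n\) with period \(R\text{.}\)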

Section 16.4 Exercises

1.

Find the optimal constant in the large sieve inequality (16.1.1) when the vectors in \(V\) are taken to be unit vectors forming the corners of a regular simplex in \(\RR^n\) with center at the origin.
Hint.
It may simplify matters to view the situation inside an \((n+1)\)-dimensional space.

2.

Prove Lemma 16.1.
Hint.
By Cauchy–Schwarz, it is enough to prove
\begin{equation*} \sum_{i=1}^n \left| \sum_{j \neq i} \frac{z_j}{\lambda_i - \lambda_j} \right|^2 \leq \frac{\pi^2}{\delta^2} \sum_{i=1}^n |z_i|^2. \end{equation*}
Do this by extremizing an appropriate Hermitian (quadratic) form, and noting that the extremal vector must be an eigenvector.

4. (Cohen).

Prove Theorem 16.5 as stated, assuming the version in which the factor \(\delta^{-1} + N - 1\) is replaced by \(\delta^{-1} + N\text{.}\)
Hint.
Apply the weak version to the \(\delta/K\)-spaced points \((\alpha + k)/K\) for \(\alpha\) running over \(S\) and \(k\) running over \(\{1, \dots, K\}\text{,}\) and the values \(b_m\) being related to the original \(a_n\) via
\begin{equation*} \sum_m b_m \exp(2 \pi i \alpha m) = \sum_n a_n \exp(2 \pi i K \alpha n). \end{equation*}
Then take the limit as \(K \to \infty\text{.}\)