This chapter begins the fourth part of the course, in which we develop large sieve inequalities. While these give various interesting results, our primary interest here will be in controlling error terms in the prime number theorem on aggregate via the Bombieri–Vinogradov inequality.
In this chapter, we introduce an additive large sieve inequality, which can be conceptualized outside of the context of number theory. We will convert this into a multiplicative form, which is more relevant to the distribution of primes, in Chapter 17.
Section16.1Overview
The purpose of a “large sieve” is to allow sieving over a range of primes not possible with the traditional sieve methods we considered earlier. The price to be paid is that one only gets results of an aggregate nature. For instance, in the Bombieri–Vinogradov theorem, we will consider the error terms in the prime number theorem in arithmetic progression for all moduli in some range, and show that the sum of the errors cannot be too large.
This said, a “large sieve inequality” does not itself involve a sieve; the sieves only appear in the application. The general large sieve problem is, given a finite set \(V\) of vectors \(v \in \CC^n\text{,}\) to find the smallest constant \(C = C(V)\) such that for any vector \(x \in \CC^n\text{,}\)
\begin{equation}
\sum_{v \in V} |v \cdot \overline{x}|^2 \leq C x \cdot \overline{x}.\tag{16.1.1}
\end{equation}
(Note that \(v \cdot \overline{x}\) is the usual Hermitian inner product.) Of course one has \(C \leq \sum_{v \in V} v \cdot \overline{v}\) by Cauchy–Schwarz term by term, but this is nowhere near optimal if the vectors \(v\) are pointing in all different directions, as then the vector \(x\) cannot simultaneously be nearly parallel to all of them. A trivial example is given by an orthonormal set of vectors, in which case \(C = 1\text{;}\) see Exercise 16.4.1 for another simple example.
In number theory applications, we tend to view the same setup as follows. Given a finite set \(X\) of complex-valued sequences, and a cutoff \(N\text{,}\) find a constant \(C = C(X,N)\) such that for any \(a_n \in \CC\text{,}\)
In preparation to state a large sieve result, we collect some elementary statements of linear algebra, whose proofs we mostly leave as exercises.
The following is a variation of a classic inequality of Hilbert.
Lemma16.1.
Let \(\lambda_1, \dots, \lambda_n\) be real numbers with \(|\lambda_i - \lambda_j|
\geq \delta\) whenever \(i \neq j\text{.}\) Then for any \(z_1, \dots, z_n \in \CC\text{,}\)
Let \(K\) be a large positive integer. By Lemma 16.1 applied to the set of \(M + \alpha_i\) and the numbers \((-1)^M z_i\) for \(i=1,\dots,n\) and \(M=1,\dots,K\text{,}\) we get
It changes nothing to run the sum over pairs of pairs in which only \(i \neq j\text{,}\) since the terms \((i,M),(i,N)\) and \((i,N),(i,M)\) cancel each other. Put \(k = M-N\) and divide by \(K\) to obtain
In the additive large sieve, we take the sequences \(x \in X\) to be of the form \(\exp(2\pi i \alpha n)\) for some \(\alpha = \alpha_x
\in \RR\) (or better, in \(\RR/\ZZ\)). In order for these to be “not too parallel”, we insist that the corresponding \(\alpha_x\) be \(\delta\)-spaced for some \(\delta > 0\text{,}\) i.e., if \(x,y \in X\) are distinct, then \(\alpha_x - \alpha_y\) must have distance at least \(\delta\) from the nearest integer.
The following inequality is due independently to Selberg and Montgomery–Vaughan; it refines a result of Davenport–Halberstam.
Theorem16.5.Additive large sieve inequality.
Fix \(\delta \in (0, 1/2]\text{.}\) Let \(S \subset \RR\) be a \(\delta\)-spaced set (necessarily finite). Then for any \(a_n \in \CC\) for \(M \lt n \leq M+N\text{,}\)
\begin{equation*}
\sum_{\alpha \in S} \left| \sum_{M \lt n\leq M+N} a_n \exp(2 \pi i \alpha
n) \right|^2 \leq (\delta^{-1} + N - 1) \sum_{M \lt n \leq M+N} |a_n|^2.
\end{equation*}
We prove here only the bound with the factor \(\delta^{-1} + N - 1\) replaced by \(\delta^{-1} + N\text{;}\) there is a fun trick to pick up the extra \(-1\) (see Exercise 16.4.4).
By Lemma 16.4, we may reduce to showing that for any \(z_\alpha \in \CC\text{,}\)
\begin{equation*}
\sum_{M \lt n \leq M+N} \left| \sum_{\alpha \in S} z_\alpha \exp(2 \pi i n \alpha)
\right|^2 \leq (\delta^{-1} + N) \sum_{\alpha \in S} |z_\alpha|^2.
\end{equation*}
When we expand the square on the left side, the diagonal terms contribute \(N \sum_\alpha |z_\alpha|^2\text{.}\) The off-diagonal terms give
\begin{equation*}
\sum_{\alpha \neq \beta} z_\alpha \overline{z_\beta}
\exp(2 \pi i K (\alpha - \beta)) \frac{\sin \pi N (\alpha - \beta)}{\sin
\pi (\alpha - \beta)}
\end{equation*}
for \(K = M + (N+1)/2\text{.}\) By Corollary 16.3, this is bounded by \(\delta^{-1} \sum_\alpha |z_\alpha|^2\text{.}\)
Exercises16.4Exercises
1.
Find the optimal constant in the large sieve inequality (16.1.1) when the vectors in \(V\) are taken to be unit vectors forming the corners of a regular simplex in \(\RR^n\) with center at the origin.
Apply the weak version to the \(\delta K\)-spaced points \((\alpha + k)/K\) for \(\alpha\) running over \(S\) and \(k\) running over \(\{1, \dots, K\}\text{,}\) and the values \(b_m\) being related to the original \(a_n\) via
\begin{equation*}
\sum_m b_m \exp(2 \pi i \alpha m) = \sum_n a_n \exp(2 \pi i K \alpha n).
\end{equation*}