This chapter begins the third part of the course, in which we will investigate a class of methods in analytic number theory known as sieve. Whereas the first two parts of the course leaned heavily on methods from complex analysis, here the emphasis will be more on analytic (in the classical sense) and combinatorial arguments.
One could write an entire book on sieve methods alone. The definitive one is [5].
Note: in ordinary English, a sieve is a device through which you pour a powder, like flour, to filter out large impurities. This process is called sifting, though in mathematical usage the term sieving also occurs.
Section12.1The sieve of Eratosthenes
In Chapter 2, we suggested that Euler's proof of the infinitude of primes can be seen as the genesis of complex-analytic methods in analytic number theory. By contrast, the sieve-theoretic methods we will consider in this part of the course have their roots in Euclid's original proof of the infinitude of primes.
Given the unique factorization property of primes, the sieve of Eratothenes is a natural strategy for determining which integers in some range of the form \(\{2, \dots, n\}\) are prime. The procedure is simple: as long as there are any integers in this range whose status as prime or composite is not yet decided, find the first undecided number \(p\text{;}\) mark \(p\) as prime; then mark \(2p, 3p, \dots\) as composite while this arithmetic progression stays within the range \(\{2,\dots,n\}\text{.}\)
In modern number theory, we refine this idea by looking at what happens if we don't run the full sieve. To begin with, one need only sift out multiples of primes up to \(n^{1/2}\) in order to leave only primes behind. However, even if one is only able to sift out multiples of primes up to \(n^\alpha\text{,}\) what remain are numbers with no prime factors less than \(n^\alpha\text{.}\) In particular, any such number has at most \(\lfloor \alpha^{-1} \rfloor\) prime factors, and so is in some sense “nearly prime”.
To turn such observations into quantitative conclusions, we must account more carefully for the fact that many numbers get sifted out more than once. The natural way to keep track of this multiple counting is the principle of inclusion-exclusion, which we turn to next.
Section12.2The principle of inclusion-exclusion
To formulate the principle of inclusion-exclusion, let \(S\) be a finite set and let \(P_1, \dots, P_n\) be subsets of \(S\text{.}\) We will think of each \(P_i\) as containing the elements of \(S\) with a certain property, such that it is relatively easy to count elements that have any fixed subset of the properties and possibly more, but what we really want is to count the elements of \(S\) with none of the properties. (For example, each property might be divisibility by a certain prime number.
Lemma12.1.Inclusion-exclusion.
For \(S\) a finite set and \(P_1,\dots,P_n\) some subsets of \(S\text{,}\)
In our applications, typically \(S = \{1, \dots, N\}\) and \(P_1, P_2, \dots\) are the sets of multiples of certain small primes. It is convenient to rewrite the principle of inclusion-exclusion in terms of the Möbius function (see Definition 3.8).
Lemma12.3.Möbius inversion.
Let \(\{a_n\}_{n=1}^\infty, \{b_n\}_{n=1}^\infty\) be two sequences of complex numbers such that
While this can be deduced from Lemma 12.2, it can also be interpreted another way. Let \(A(s), B(s)\) be the Dirichlet series associated to the arithmetic functions \(n \mapsto a_n, n \mapsto b_n\text{;}\) then the first equation asserts that \(B = A * 1\) and the second that \(A = B * \mu\text{,}\) and the equivalence of the two is the statement that the constant function \(1\) and the Möbius function \(\mu\) are inverses with respect to the convolution product (Definition 3.8).
Section12.3Smooth numbers
Before proceeding, we need a quick lemma concerning smooth numbers.
Definition12.4.
A natural number is \(z\)-smooth if its prime factors are all less than or equal to \(z\text{.}\)
Lemma12.5.Rankin.
Let \(\Phi(x,z)\) be the number of \(z\)-smooth numbers less than or equal to \(x\text{.}\) Then for any \(\delta > 0\text{,}\)
If we expand the right side as a product of geometric series, we get a term \((x/n)^\delta \geq 1\) for each \(z\)-smooth number \(n \leq x\) (among other terms). This yields the claim.
Section12.4Back to Eratosthenes
Here is a modern version of the sieve of Eratosthenes, following [18].
Definition12.6.
Let \(A\) be a set of natural numbers, and let \(P\) be a set of primes; also set
\begin{equation*}
P(z) = \prod_{p \in P, p \leq z} p.
\end{equation*}
For each \(p \in P\text{,}\) choose a set \(R_p\) consisting of some number \(\omega(p)\) of residue classes modulo \(p\text{,}\) and let \(A_p\) be the subset of \(A\) whose elements belong to the chosen residue classes. Put
For \(d\) squarefree with all prime factors in \(P\text{,}\) put \(\omega(d) = \prod_{p|d} \omega(p)\) and \(A_d = \cap_{p | d} A_p\text{.}\) We wish to estimate \(S(A,P,z)\text{,}\) the number of elements of \(A\) not belonging to \(A_{p}\) for any \(p \leq z\text{.}\)
In order to say anything about \(S(A,P,z)\text{,}\) we must assume some good properties about the chosen residue classes. For starters, we want that for some \(\kappa > 0\text{,}\)
\begin{equation}
\sum_{p \leq z, p \in P} \frac{\omega(p) \log p}{p} \leq \kappa \log z + O(1),\tag{12.4.1}
\end{equation}
where the big-O bound is for \(z \to \infty\) and the constant depends on \(P, R_p, \kappa\text{.}\)
With notation as in Definition 12.6, fix \(P, R_p, \kappa\) satisfying (12.4.1), and also fix \(C,c > 0\text{.}\) Then for any set \(A\) and any \(X,x>0\) such that
\begin{equation*}
\left|\#A_d - \frac{\omega(d)}{d} X \right| \leq c \omega(d)
\end{equation*}
and \(\#A_d = 0\) for \(d > Cx\text{,}\) we have
\begin{equation*}
S(A,P,z) = X W(z) + O \left( x \log^{\kappa+1} z \exp
\left(- \frac{\log x}{\log z} \right) \right),
\end{equation*}
where the big-O bound is for \(z \to \infty\text{,}\) uniformly in \(A,x,X\text{.}\)
There are infinitely many primes \(p\) such that \(p+2\) is also prime.
One can even guess the correct asymptotic up to a constant factor, by a very simple argument: since the probability of a random number in \([1, \dots, N]\) being prime is asymptotically \(1/\log N\text{,}\) the number of twin primes in \([1,\dots,N]\) should be asymptotic to \(N/\log^2 N\text{.}\) (Getting the constant right is a bit trickier; see Chapter 20 for discussion.)
As a corollary of Theorem 12.9, we obtain the following result of Brun (with a slightly simpler proof; Brun originally used the combinatorial sieve to be introduced in Chapter 13). We will improve upon this result using the Selberg sieve later (Chapter 15).
Theorem12.11.
The number of primes \(p \leq x\) such that \(p+2\) is also prime is \(O(x (\log \log x)^2 / (\log x)^2)\text{.}\)
We will apply Theorem 12.9 with \(A = \{1, \dots, x\}\) and \(P = \{p: 2 \lt p \leq z\}\text{.}\) For each \(p \in P\text{,}\) let \(R_p\) consist of the residue classes of \(0,-2\text{,}\) so that \(\omega(p) = 2\text{.}\) For \(d\) odd squarefree, \(\omega(d) = 2^{\nu(d)}\) for \(\nu(d)\) the number of prime factors of \(d\text{.}\) One checks easily (exercise; see (12.5.1)) that
\begin{equation}
\left|\#A_d - x \frac{\omega(d)}{d} \right| \leq 2^{\nu(d)}.\tag{12.5.1}
\end{equation}
by Exercise 5.5.7, we deduce that \(S(A,P,z)= O(x(\log \log x)^2/(\log x)^2)\text{.}\)
To conclude, note that \(S(A,P,z)\) includes all primes \(z+2 \leq p \leq x\) such that \(p+2\) is also prime. The number of twin primes up to \(x\) that we missed is at most \(z = x^{1/(A \log \log x)}\text{,}\) so this doesn't affect the claim.
This is much easier than sieving over primes! Just make sure to round no more than \(O(N^{1-\epsilon})\) fractions off to the nearest integer. Also, don't forget that \(6/\pi^2 = 1/\zeta(2) = \prod_p (1 - 1/p^2)\text{.}\)