Skip to main content

Chapter 12 Revisiting the sieve of Eratosthenes

This chapter begins the third part of the course, in which we will investigate a class of methods in analytic number theory known as sieve. Whereas the first two parts of the course leaned heavily on methods from complex analysis, here the emphasis will be more on analytic (in the classical sense) and combinatorial arguments.
One could write an entire book on sieve methods alone. The definitive one is [5].
Note: in ordinary English, a sieve is a device through which you pour a powder, like flour, to filter out large impurities. This process is called sifting, though in mathematical usage the term sieving also occurs.

Section 12.1 The sieve of Eratosthenes

In Chapter 2, we suggested that Euler's proof of the infinitude of primes can be seen as the genesis of complex-analytic methods in analytic number theory. By contrast, the sieve-theoretic methods we will consider in this part of the course have their roots in Euclid's original proof of the infinitude of primes.
Given the unique factorization property of primes, the sieve of Eratothenes is a natural strategy for determining which integers in some range of the form \(\{2, \dots, n\}\) are prime. The procedure is simple: as long as there are any integers in this range whose status as prime or composite is not yet decided, find the first undecided number \(p\text{;}\) mark \(p\) as prime; then mark \(2p, 3p, \dots\) as composite while this arithmetic progression stays within the range \(\{2,\dots,n\}\text{.}\)
In modern number theory, we refine this idea by looking at what happens if we don't run the full sieve. To begin with, one need only sift out multiples of primes up to \(n^{1/2}\) in order to leave only primes behind. However, even if one is only able to sift out multiples of primes up to \(n^\alpha\text{,}\) what remain are numbers with no prime factors less than \(n^\alpha\text{.}\) In particular, any such number has at most \(\lfloor \alpha^{-1} \rfloor\) prime factors, and so is in some sense “nearly prime”.
To turn such observations into quantitative conclusions, we must account more carefully for the fact that many numbers get sifted out more than once. The natural way to keep track of this multiple counting is the principle of inclusion-exclusion, which we turn to next.

Section 12.2 The principle of inclusion-exclusion

To formulate the principle of inclusion-exclusion, let \(S\) be a finite set and let \(P_1, \dots, P_n\) be subsets of \(S\text{.}\) We will think of each \(P_i\) as containing the elements of \(S\) with a certain property, such that it is relatively easy to count elements that have any fixed subset of the properties and possibly more, but what we really want is to count the elements of \(S\) with none of the properties. (For example, each property might be divisibility by a certain prime number.
If \(s \in S\) belongs to \(m\) of the subsets, then the number of times it gets counted on the right side is
\begin{equation*} \binom{m}{0} - \binom{m}{1} + \cdots \end{equation*}
which equals 1 if \(m=0\) and vanishes otherwise.
More generally, suppose we want to sum the values of \(f\colon S \to \CC\) over the elements of \(S\) with none of the given properties.
In our applications, typically \(S = \{1, \dots, N\}\) and \(P_1, P_2, \dots\) are the sets of multiples of certain small primes. It is convenient to rewrite the principle of inclusion-exclusion in terms of the Möbius function (see Definition 3.8).
While this can be deduced from Lemma 12.2, it can also be interpreted another way. Let \(A(s), B(s)\) be the Dirichlet series associated to the arithmetic functions \(n \mapsto a_n, n \mapsto b_n\text{;}\) then the first equation asserts that \(B = A * 1\) and the second that \(A = B * \mu\text{,}\) and the equivalence of the two is the statement that the constant function \(1\) and the Möbius function \(\mu\) are inverses with respect to the convolution product (Definition 3.8).

Section 12.3 Smooth numbers

Before proceeding, we need a quick lemma concerning smooth numbers.

Definition 12.4.

A natural number is \(z\)-smooth if its prime factors are all less than or equal to \(z\text{.}\)
If we expand the right side as a product of geometric series, we get a term \((x/n)^\delta \geq 1\) for each \(z\)-smooth number \(n \leq x\) (among other terms). This yields the claim.

Section 12.4 Back to Eratosthenes

Here is a modern version of the sieve of Eratosthenes, following [18].

Definition 12.6.

Let \(A\) be a set of natural numbers, and let \(P\) be a set of primes; also set
\begin{equation*} P(z) = \prod_{p \in P, p \leq z} p. \end{equation*}
For each \(p \in P\text{,}\) choose a set \(R_p\) consisting of some number \(\omega(p)\) of residue classes modulo \(p\text{,}\) and let \(A_p\) be the subset of \(A\) whose elements belong to the chosen residue classes. Put
\begin{equation*} W(z) = \prod_{p |P(z)} \left( 1 - \frac{\omega(p)}{p} \right), \end{equation*}
For \(d\) squarefree with all prime factors in \(P\text{,}\) put \(\omega(d) = \prod_{p|d} \omega(p)\) and \(A_d = \cap_{p | d} A_p\text{.}\) We wish to estimate \(S(A,P,z)\text{,}\) the number of elements of \(A\) not belonging to \(A_{p}\) for any \(p \leq z\text{.}\)
In order to say anything about \(S(A,P,z)\text{,}\) we must assume some good properties about the chosen residue classes. For starters, we want that for some \(\kappa > 0\text{,}\)
\begin{equation} \sum_{p \leq z, p \in P} \frac{\omega(p) \log p}{p} \leq \kappa \log z + O(1),\tag{12.4.1} \end{equation}
where the big-O bound is for \(z \to \infty\) and the constant depends on \(P, R_p, \kappa\text{.}\)
Put \(F_\omega(t,z)=\sum_{d \lt t, d |P(z)} \omega(d)\text{.}\) Then
\begin{equation} \sum_{d > Cx, d | P(z)} \frac{\omega(d)}{d} \leq \int_{Cx}^\infty \frac{F_\omega(t,z)}{t^2} dt\tag{12.4.2} \end{equation}
(exercise; see Exercise 12.6.2), so the result follows from Lemma 12.7.

Section 12.5 Motivation: the twin prime conjecture

One can even guess the correct asymptotic up to a constant factor, by a very simple argument: since the probability of a random number in \([1, \dots, N]\) being prime is asymptotically \(1/\log N\text{,}\) the number of twin primes in \([1,\dots,N]\) should be asymptotic to \(N/\log^2 N\text{.}\) (Getting the constant right is a bit trickier; see Chapter 20 for discussion.)
As a corollary of Theorem 12.9, we obtain the following result of Brun (with a slightly simpler proof; Brun originally used the combinatorial sieve to be introduced in Chapter 13). We will improve upon this result using the Selberg sieve later (Chapter 15).
We will apply Theorem 12.9 with \(A = \{1, \dots, x\}\) and \(P = \{p: 2 \lt p \leq z\}\text{.}\) For each \(p \in P\text{,}\) let \(R_p\) consist of the residue classes of \(0,-2\text{,}\) so that \(\omega(p) = 2\text{.}\) For \(d\) odd squarefree, \(\omega(d) = 2^{\nu(d)}\) for \(\nu(d)\) the number of prime factors of \(d\text{.}\) One checks easily (exercise; see (12.5.1)) that
\begin{equation} \left|\#A_d - x \frac{\omega(d)}{d} \right| \leq 2^{\nu(d)}.\tag{12.5.1} \end{equation}
Since
\begin{equation*} \sum_{p\leq z} \frac{\log p}{p} = O(\log z) \end{equation*}
from a prior homework, we can take \(\kappa = 2\) in Theorem 12.9. This yields
\begin{equation*} S(A,P,z) = x W(z) + O \left( x \log^3 z \exp \left( - \frac{\log x}{\log z} \right) \right), \end{equation*}
where the big-O constant does not depend on \(x\) or \(z\text{.}\) We now take
\begin{equation*} \log z = \frac{\log x}{A \log \log x} \end{equation*}
for a suitable constant \(A\text{.}\) Since
\begin{equation*} W(z) \leq \prod_{3 \leq p \leq z} \left(1 - \frac{1}{p} \right)^2 = O((\log z)^{-2}) \end{equation*}
by Exercise 5.5.7, we deduce that \(S(A,P,z)= O(x(\log \log x)^2/(\log x)^2)\text{.}\)
To conclude, note that \(S(A,P,z)\) includes all primes \(z+2 \leq p \leq x\) such that \(p+2\) is also prime. The number of twin primes up to \(x\) that we missed is at most \(z = x^{1/(A \log \log x)}\text{,}\) so this doesn't affect the claim.

Exercises 12.6 Exercises

5. (Brun).

Prove that the sum of the reciprocals of the twin primes converges.

6.

Prove that
\begin{equation*} \Phi(x,z) = O\left( x \log z \exp \left( - \frac{\log x}{\log z} \right) \right) \end{equation*}
where the big-O bound is for \(z \to \infty\text{,}\) uniformly in \(x\text{.}\)
Hint.
Apply Lemma 12.5 with \(\delta = 1 - (\log z)^{-1}\text{.}\)

7.

Prove that the number of squarefree integers in \(\{1, \dots, N\}\) is
\begin{equation*} \frac{6}{\pi^2} N + O(N^{1-\epsilon}) \end{equation*}
for some explicit value of \(\epsilon\text{.}\)
Hint.
This is much easier than sieving over primes! Just make sure to round no more than \(O(N^{1-\epsilon})\) fractions off to the nearest integer. Also, don't forget that \(6/\pi^2 = 1/\zeta(2) = \prod_p (1 - 1/p^2)\text{.}\)