Revisiting the sieve of Eratosthenes

Chapter 12 Revisiting the sieve of Eratosthenes

This chapter begins the third part of the course, in which we will investigate a class of methods in analytic number theory known as sieve. Whereas the first two parts of the course leaned heavily on methods from complex analysis, here the emphasis will be more on analytic (in the classical sense) and combinatorial arguments.

One could write an entire book on sieve methods alone. The definitive one is [5].

Note: in ordinary English, a sieve is a device through which you pour a powder, like flour, to filter out large impurities. This process is called sifting, though in mathematical usage the term sieving also occurs.

Section 12.1 The sieve of Eratosthenes

In Chapter 2, we suggested that Euler's proof of the infinitude of primes can be seen as the genesis of complex-analytic methods in analytic number theory. By contrast, the sieve-theoretic methods we will consider in this part of the course have their roots in Euclid's original proof of the infinitude of primes.

Given the unique factorization property of primes, the sieve of Eratothenes is a natural strategy for determining which integers in some range of the form \(\{2, \dots, n\}\) are prime. The procedure is simple: as long as there are any integers in this range whose status as prime or composite is not yet decided, find the first undecided number \(p\text{;}\) mark \(p\) as prime; then mark \(2p, 3p, \dots\) as composite while this arithmetic progression stays within the range \(\{2,\dots,n\}\text{.}\)

In modern number theory, we refine this idea by looking at what happens if we don't run the full sieve. To begin with, one need only sift out multiples of primes up to \(n^{1/2}\) in order to leave only primes behind. However, even if one is only able to sift out multiples of primes up to \(n^\alpha\text{,}\) what remain are numbers with no prime factors less than \(n^\alpha\text{.}\) In particular, any such number has at most \(\lfloor \alpha^{-1} \rfloor\) prime factors, and so is in some sense “nearly prime”.

To turn such observations into quantitative conclusions, we must account more carefully for the fact that many numbers get sifted out more than once. The natural way to keep track of this multiple counting is the principle of inclusion-exclusion, which we turn to next.

Section 12.2 The principle of inclusion-exclusion

To formulate the principle of inclusion-exclusion, let \(S\) be a finite set and let \(P_1, \dots, P_n\) be subsets of \(S\text{.}\) We will think of each \(P_i\) as containing the elements of \(S\) with a certain property, such that it is relatively easy to count elements that have any fixed subset of the properties and possibly more, but what we really want is to count the elements of \(S\) with none of the properties. (For example, each property might be divisibility by a certain prime number.

Lemma 12.1. Inclusion-exclusion.

For \(S\) a finite set and \(P_1,\dots,P_n\) some subsets of \(S\text{,}\)

\begin{equation*} \#(S \setminus (P_1 \cup \cdots \cup P_n)) = \sum_{T \subseteq \{1,\dots,n\}} (-1)^{\#T} \# \left( \bigcap_{t \in T} P_t \right). \end{equation*}

Proof.

If \(s \in S\) belongs to \(m\) of the subsets, then the number of times it gets counted on the right side is

\begin{equation*} \binom{m}{0} - \binom{m}{1} + \cdots \end{equation*}

which equals 1 if \(m=0\) and vanishes otherwise.

More generally, suppose we want to sum the values of \(f\colon S \to \CC\) over the elements of \(S\) with none of the given properties.

Lemma 12.2. Weighted inclusion-exclusion.

For \(S\) a finite set, \(P_1,\dots,P_n\) some subsets of \(S\text{,}\) and \(f\colon S \to \CC\) any function,

\begin{equation*} \sum_{s \in S \setminus (P_1 \cup \cdots \cup P_n)} f(s) = \sum_{T \subseteq \{1,\dots,n\}} (-1)^{\#T} \left( \sum_{s \in \cap_{t \in T} P_t} f(s) \right). \end{equation*}

Proof.

The proof of Lemma 12.1 immediately implies this.

In our applications, typically \(S = \{1, \dots, N\}\) and \(P_1, P_2, \dots\) are the sets of multiples of certain small primes. It is convenient to rewrite the principle of inclusion-exclusion in terms of the Möbius function (see Definition 3.8).

Lemma 12.3. Möbius inversion.

Let \(\{a_n\}_{n=1}^\infty, \{b_n\}_{n=1}^\infty\) be two sequences of complex numbers such that

\begin{equation*} b_n = \sum_{d|n} a_{n/d}. \end{equation*}

Then

\begin{equation*} a_n = \sum_{d|n} \mu(d) b_{n/d}. \end{equation*}

Proof.

While this can be deduced from Lemma 12.2, it can also be interpreted another way. Let \(A(s), B(s)\) be the Dirichlet series associated to the arithmetic functions \(n \mapsto a_n, n \mapsto b_n\text{;}\) then the first equation asserts that \(B = A * 1\) and the second that \(A = B * \mu\text{,}\) and the equivalence of the two is the statement that the constant function \(1\) and the Möbius function \(\mu\) are inverses with respect to the convolution product (Definition 3.8).

Section 12.3 Smooth numbers

Before proceeding, we need a quick lemma concerning smooth numbers.

Definition 12.4.

A natural number is \(z\)-smooth if its prime factors are all less than or equal to \(z\text{.}\)

Lemma 12.5. Rankin.

Let \(\Phi(x,z)\) be the number of \(z\)-smooth numbers less than or equal to \(x\text{.}\) Then for any \(\delta > 0\text{,}\)

\begin{equation*} \Phi(x,z) \leq x^\delta \prod_{p \leq z} (1 - p^{-\delta})^{-1}. \end{equation*}

Proof.

If we expand the right side as a product of geometric series, we get a term \((x/n)^\delta \geq 1\) for each \(z\)-smooth number \(n \leq x\) (among other terms). This yields the claim.

Section 12.4 Back to Eratosthenes

Here is a modern version of the sieve of Eratosthenes, following [18].

Definition 12.6.

Let \(A\) be a set of natural numbers, and let \(P\) be a set of primes; also set

\begin{equation*} P(z) = \prod_{p \in P, p \leq z} p. \end{equation*}

For each \(p \in P\text{,}\) choose a set \(R_p\) consisting of some number \(\omega(p)\) of residue classes modulo \(p\text{,}\) and let \(A_p\) be the subset of \(A\) whose elements belong to the chosen residue classes. Put

\begin{equation*} W(z) = \prod_{p |P(z)} \left( 1 - \frac{\omega(p)}{p} \right), \end{equation*}

For \(d\) squarefree with all prime factors in \(P\text{,}\) put \(\omega(d) = \prod_{p|d} \omega(p)\) and \(A_d = \cap_{p | d} A_p\text{.}\) We wish to estimate \(S(A,P,z)\text{,}\) the number of elements of \(A\) not belonging to \(A_{p}\) for any \(p \leq z\text{.}\)

In order to say anything about \(S(A,P,z)\text{,}\) we must assume some good properties about the chosen residue classes. For starters, we want that for some \(\kappa > 0\text{,}\)

\begin{equation} \sum_{p \leq z, p \in P} \frac{\omega(p) \log p}{p} \leq \kappa \log z + O(1),\tag{12.4.1} \end{equation}

where the big-O bound is for \(z \to \infty\) and the constant depends on \(P, R_p, \kappa\text{.}\)

Lemma 12.7.

With notation as in Definition 12.6 and assuming (12.4.1), we have

\begin{equation*} \sum_{d \lt t, d |P(z)} \omega(d) = O\left( t (\log z)^\kappa \exp\left( - \frac{\log t}{\log z} \right) \right), \end{equation*}

where the big-O bound is for \(z \to \infty\) and the constant depends on \(P, R_p, \kappa\text{.}\)

Proof.

Exercise (Exercise 12.6.1).

Lemma 12.8.

Fix \(C>0\text{.}\) With notation as in Definition 12.6 and assuming (12.4.1), we have

\begin{equation*} \sum_{d > Cx, d | P(z)} \frac{\omega(d)}{d} = O\left( (\log z)^{\kappa + 1} \exp\left( - \frac{\log x}{\log z} \right) \right), \end{equation*}

where the big-O bound is for \(z \to \infty\) and the constant depends on \(P, R_p, \kappa, C\text{.}\)

Proof.

Put \(F_\omega(t,z)=\sum_{d \lt t, d |P(z)} \omega(d)\text{.}\) Then

\begin{equation} \sum_{d > Cx, d | P(z)} \frac{\omega(d)}{d} \leq \int_{Cx}^\infty \frac{F_\omega(t,z)}{t^2} dt\tag{12.4.2} \end{equation}

(exercise; see Exercise 12.6.2), so the result follows from Lemma 12.7.

Theorem 12.9.

With notation as in Definition 12.6, fix \(P, R_p, \kappa\) satisfying (12.4.1), and also fix \(C,c > 0\text{.}\) Then for any set \(A\) and any \(X,x>0\) such that

\begin{equation*} \left|\#A_d - \frac{\omega(d)}{d} X \right| \leq c \omega(d) \end{equation*}

and \(\#A_d = 0\) for \(d > Cx\text{,}\) we have

\begin{equation*} S(A,P,z) = X W(z) + O \left( x \log^{\kappa+1} z \exp \left(- \frac{\log x}{\log z} \right) \right), \end{equation*}

where the big-O bound is for \(z \to \infty\text{,}\) uniformly in \(A,x,X\text{.}\)

Proof.

Exercise (Exercise 12.6.3).

Section 12.5 Motivation: the twin prime conjecture

Conjecture 12.10. Twin prime conjecture.

There are infinitely many primes \(p\) such that \(p+2\) is also prime.

One can even guess the correct asymptotic up to a constant factor, by a very simple argument: since the probability of a random number in \([1, \dots, N]\) being prime is asymptotically \(1/\log N\text{,}\) the number of twin primes in \([1,\dots,N]\) should be asymptotic to \(N/\log^2 N\text{.}\) (Getting the constant right is a bit trickier; see Chapter 20 for discussion.)

As a corollary of Theorem 12.9, we obtain the following result of Brun (with a slightly simpler proof; Brun originally used the combinatorial sieve to be introduced in Chapter 13). We will improve upon this result using the Selberg sieve later (Chapter 15).

Theorem 12.11.

The number of primes \(p \leq x\) such that \(p+2\) is also prime is \(O(x (\log \log x)^2 / (\log x)^2)\text{.}\)

Proof.

We will apply Theorem 12.9 with \(A = \{1, \dots, x\}\) and \(P = \{p: 2 \lt p \leq z\}\text{.}\) For each \(p \in P\text{,}\) let \(R_p\) consist of the residue classes of \(0,-2\text{,}\) so that \(\omega(p) = 2\text{.}\) For \(d\) odd squarefree, \(\omega(d) = 2^{\nu(d)}\) for \(\nu(d)\) the number of prime factors of \(d\text{.}\) One checks easily (exercise; see (12.5.1)) that

\begin{equation} \left|\#A_d - x \frac{\omega(d)}{d} \right| \leq 2^{\nu(d)}.\tag{12.5.1} \end{equation}

Since

\begin{equation*} \sum_{p\leq z} \frac{\log p}{p} = O(\log z) \end{equation*}

from a prior homework, we can take \(\kappa = 2\) in Theorem 12.9. This yields

\begin{equation*} S(A,P,z) = x W(z) + O \left( x \log^3 z \exp \left( - \frac{\log x}{\log z} \right) \right), \end{equation*}

where the big-O constant does not depend on \(x\) or \(z\text{.}\) We now take

\begin{equation*} \log z = \frac{\log x}{A \log \log x} \end{equation*}

for a suitable constant \(A\text{.}\) Since

\begin{equation*} W(z) \leq \prod_{3 \leq p \leq z} \left(1 - \frac{1}{p} \right)^2 = O((\log z)^{-2}) \end{equation*}

by Exercise 5.5.7, we deduce that \(S(A,P,z)= O(x(\log \log x)^2/(\log x)^2)\text{.}\)

To conclude, note that \(S(A,P,z)\) includes all primes \(z+2 \leq p \leq x\) such that \(p+2\) is also prime. The number of twin primes up to \(x\) that we missed is at most \(z = x^{1/(A \log \log x)}\text{,}\) so this doesn't affect the claim.

Exercises 12.6 Exercises

1.

Prove Lemma 12.7 using Rankin's trick.

5. (Brun).

Prove that the sum of the reciprocals of the twin primes converges.

6.

Prove that

\begin{equation*} \Phi(x,z) = O\left( x \log z \exp \left( - \frac{\log x}{\log z} \right) \right) \end{equation*}

where the big-O bound is for \(z \to \infty\text{,}\) uniformly in \(x\text{.}\)

Hint.

Apply Lemma 12.5 with \(\delta = 1 - (\log z)^{-1}\text{.}\)

7.

Prove that the number of squarefree integers in \(\{1, \dots, N\}\) is

\begin{equation*} \frac{6}{\pi^2} N + O(N^{1-\epsilon}) \end{equation*}

for some explicit value of \(\epsilon\text{.}\)

Hint.

This is much easier than sieving over primes! Just make sure to round no more than \(O(N^{1-\epsilon})\) fractions off to the nearest integer. Also, don't forget that \(6/\pi^2 = 1/\zeta(2) = \prod_p (1 - 1/p^2)\text{.}\)

Notes on analytic number theory

Search Results:

Chapter 12 Revisiting the sieve of Eratosthenes

Section 12.1 The sieve of Eratosthenes

Section 12.2 The principle of inclusion-exclusion

Lemma 12.1. Inclusion-exclusion.

Proof.

Lemma 12.2. Weighted inclusion-exclusion.

Proof.

Lemma 12.3. Möbius inversion.

Proof.

Section 12.3 Smooth numbers

Definition 12.4.

Lemma 12.5. Rankin.

Proof.

Section 12.4 Back to Eratosthenes

Definition 12.6.

Lemma 12.7.

Proof.

Lemma 12.8.

Proof.

Theorem 12.9.

Proof.

Section 12.5 Motivation: the twin prime conjecture

Conjecture 12.10. Twin prime conjecture.

Theorem 12.11.

Proof.

Exercises 12.6 Exercises

1.

2.

3.

4.

5. (Brun).

6.

7.