Section 13.1 Sieve setup
Let \(f: \NN \to \CC\) be an arithmetic function, and suppose we want to estimate the sum of \(f\) over primes. More precisely, let \(P\) be a set of primes, and put
\begin{equation*}
P(z) = \prod_{p \leq z, p \in P} p.
\end{equation*}
If we define
\begin{align*}
S(x,z) \amp= \sum_{n \leq x, (n, P(z)) = 1} f(n),\\
A_d(x) \amp= \sum_{n \leq x, n \equiv 0 \pmod{d}} f(n)
\end{align*}
(with the dependence on \(P\) and \(f\) suppressed from the notation), we have
\begin{equation*}
S(x,z) = \sum_{d | P(z)} \mu(d) A_d(x).
\end{equation*}
As before, suppose there is a multiplicative function \(g\) such that for \(d\) squarefree with all prime factors in \(P\text{,}\)
\begin{equation*}
A_d(x) = g(d) X + r_d(x),
\end{equation*}
with \(X = X(x)\) independent of \(d\text{,}\) and the error term \(r_d(x)\) small when \(d\) is small relative to \(x\) (in a sense to be made precise later). Suppose further that
\begin{equation}
g(p) \in [0,1) \quad (p \in P); \qquad g(p) = 0 \quad (p \notin P).\tag{13.1.1}
\end{equation}
(If we need to take \(g(p) = 1\text{,}\) then we cannot expect to get much of a contribution from numbers not divisible by \(p\text{;}\) we should resign ourselves to this, and instead remove \(p\) from \(P\text{.}\)) Then we can rewrite
\begin{align*}
S(x,z) \amp= V(z)X + R(x,z)\\
V(z) \amp= \prod_{p | P(z)} (1 - g(p))\\
R(x,z) \amp= \sum_{d | P(z)} \mu(d) r_d(x).
\end{align*}
If \(z\) is small relative to \(x\text{,}\) which in practice will mean \(z \lt x^\alpha\) for some cutoff \(\alpha \in (0,1)\text{,}\) we may be able to show that the main term \(V(z)X\) dominates the error term \(R(x,z)\text{.}\) Again, the main term is what you would predict from the heuristic that if an integer is chosen randomly, its divisibilities by different primes should act like independent random events.
For instance, if \(P\) is the set of all primes and \(z \geq x^{1/2}\text{,}\) then \(S(x,z) = \sum_{p \leq x} f(p)\text{.}\) If \(f\) is the function
\begin{equation*}
f(n) = \begin{cases} 1 \amp \mbox{ prime} \\ 0 \amp \mbox{otherwise},
\end{cases}
\end{equation*}
then by the error term in the prime number theorem for arithmetic progressions (
Theorem 11.11),
\begin{equation*}
r_d(x) = O(x \log^{-A} x)
\end{equation*}
for any fixed \(A>0\text{.}\) (It is now important that we have that bound uniformly in \(d\text{!}\)) Also, \(S(x,x^{1/2})\) counts twin primes up to \(x\text{,}\) whereas \(S(x,x^{1/(N+1)})\) counts primes \(p\) such that \(p+2\) has no prime factor less than \(x^{1/(N+1)}\text{,}\) and hence has at most \(N\) prime factors.
Section 13.2 Brun's combinatorial sieve
We would like somewhat finer control than was provided by the sieve of Eratosthenes; the trouble is that \(R(x,z)\) has too many terms for us to be able to control it.
Brun's approach to get around this is to truncate the Möbius function by restricting it to suitable subsets \(D^+\) and \(D^-\text{,}\) subject to the restriction that for \(n\) a product of primes in \(P\text{,}\) the incomplete convolutions
\begin{equation*}
\delta^+(n) = \sum_{d|n, d\in D^+} \mu(d), \qquad
\delta^-(n) = \sum_{d|n, d\in D^-} \mu(d)
\end{equation*}
satisfy
\begin{equation}
\delta^-(n) \leq \delta(n) \leq \delta^+(n)\tag{13.2.1}
\end{equation}
for
\begin{equation*}
\delta(n) = \sum_{d|n} \mu(d) = \begin{cases} 1 \amp n = 1 \\ 0 \amp n > 1.
\end{cases}
\end{equation*}
One such choice would be to take
\(D^+\) and
\(D^-\) to consist of all squarefree numbers whose number of distinct prime factors is even or odd, respectively. This choice is much too crude; we should instead make a choice that allows some cancellation in
\(\delta^-\) and
\(\delta^+\) without messing up the inequality
(13.2.1). Moreover, we want to restrict
\(D^+\) and
\(D^-\) to be subsets of
\(\{1,\dots, y\}\) for some
\(y\) which is not too large compared to
\(x\text{.}\)
Let \(\lambda^+(d)\) and \(\lambda^-(d)\) denote the functions which agree with \(\mu\) on \(D^+\) and \(D^-\text{,}\) respectively, and are zero elsewhere. Put
\begin{align*}
V^{\pm}(z) \amp= \sum_{d | P(z)} \lambda^{\pm}(d) g(d)\\
R^{\pm}(x,z) \amp= \sum_{d | P(z)} \lambda^{\pm}(d) r_d(x).
\end{align*}
\begin{equation}
V^-(z) X + R^-(x,z) \leq S(x,z) \leq V^+(z) X + R^+(x,z).\tag{13.2.2}
\end{equation}
It is not at all obvious how one can usefully arrange for
\(D^+, D^-\) to satisfy
(13.2.1); here is Brun's choice. For
\(d\) a squarefree positive integer, write
\(d = p_1 \cdots p_r\) with
\(p_1 > \cdots > p_r\text{.}\) Set
\begin{align*}
D^+ \amp= \{d = p_1 \cdots p_r: p_m \lt y_m \quad \mbox{ odd} \}\\
D^- \amp= \{d = p_1 \cdots p_r: p_m \lt y_m \quad \mbox{ even} \},
\end{align*}
where \(y_1, y_2, \dots\) are certain parameters which may depend on \(d\text{.}\) (By convention, \(1 \in D^{\pm}\text{.}\)) We then have the following.
Lemma 13.1.
With notation as above, let \(V_n(z)\) be the sum of \(g(p_1 \cdots p_n) V(p_n)\) over sequences \(p_1 > \cdots > p_n\) of primes such that:
\(p_1 \lt z\text{;}\)
\(p_n \geq y_n\text{;}\)
\(p_m \lt y_m\) for \(m \lt n\) with \(m \equiv n \pmod{2}\text{.}\)
Then
\begin{align*}
V(z) \amp= V^+(z) - \sum_{n \equiv 1 \,(2)} V_n(z)\\
V(z) \amp= V^-(z) + \sum_{n \equiv 0 \,(2)} V_n(z)
\end{align*}
and so
\begin{equation}
V^-(z) \leq V(z) \leq V^+(z).\tag{13.2.3}
\end{equation}
Proof.
In particular, for a given
\(n\text{,}\) we deduce
(13.2.1) from
(13.2.3) by rigging up the set
\(P\) so that
\(P(z) = n\) and putting
\(g(d) = 1\) for all
\(d\text{.}\)
The functions \(\lambda^+\) and \(\lambda^-\) given above are together called the combinatorial sieve with parameters \(y_1, y_2, \dots\text{.}\) To use the combinatorial sieve, one must bound
\begin{equation*}
R(x,y) = \sum_{d \lt y, d | P(z)} |r_d(x)|,
\end{equation*}
for
\(y\) such that
\(D^{\pm} \subset \{1, \dots, y\}\text{;}\) in this case
\(R(x,y) \geq |R^{\pm}(x,z)|\text{,}\) giving error bounds in
(13.2.2). One must also bound
\(V^{\pm}(z)\text{.}\)
Section 13.3 Setting some parameters
To turn this into an actual numerical theorem, we must set the sieve parameters; we do this following
[13]. Remember that we may allow the
\(y_i\) to depend on
\(d\text{.}\)
Write \(d = p_1 \cdots p_r\) with \(p_1 > \cdots > p_r\text{;}\) we now take
\begin{equation*}
y_m = (y / (p_1 \cdots p_m))^{1/\beta},
\end{equation*}
where \(\beta > 1\) will be specified later. This makes it clear that all elements of \(D^+ \cup D^-\) belong to \(\{1, \dots, y\}\) except possibly for single primes in \(D^-\text{.}\) We can remedy this by requiring \(z \leq y\text{;}\) more precisely, we will take \(z = y^{1/s}\) for some \(s \geq \beta\text{.}\)
We will also need to make some restriction on the multiplicative function \(g\text{.}\) Namely, we assume that for some \(K>1\) and \(\kappa > 0\text{,}\) we have for all \(w,z\text{,}\)
\begin{equation}
\prod_{w \leq p \lt z} (1-g(p))^{-1} \leq K \left( \frac{\log z}{\log w} \right)^\kappa.\tag{13.3.1}
\end{equation}
We refer to \(\kappa\) as a sieve dimension of the function \(g\text{.}\) This number is quite critical; it will determine how large we can make \(z\) compared to \(y\text{,}\) which determines how many small primes we can use for sieving.
Section 13.4 Bounding the main term
We need an upper bound on \(V^+(z)\) and a lower bound on \(V^-(z)\text{;}\) we get both of these by getting an upper bound on \(V_n(z)\text{.}\) First, let us simplify the sum by relaxing the summation conditions. We claim that for any tuple \(p_1, \dots, p_n\) appearing in the sum defining \(V_n(z)\text{,}\) and any \(m \lt n\text{,}\)
\begin{equation}
p_1 \cdots p_{m-1} p_m^\beta \lt y.\tag{13.4.1}
\end{equation}
Namely, if \(m \equiv n \pmod{2}\text{,}\) we have the stronger inequality
\begin{equation*}
p_1 \cdots p_{m-1} p_m^{1+\beta} \lt y.
\end{equation*}
If \(m > 1\) and \(m \not\equiv n \pmod{2}\text{,}\) we have
\begin{equation*}
p_1 \cdots p_{m-1} p_m^\beta \lt p_1 \cdots p_{m-2} p_{m-1}^{1+\beta} \lt y.
\end{equation*}
Finally, if \(m = 1\) and \(m \not\equiv n \pmod{2}\text{,}\) we have
\begin{equation*}
p_1^\beta \lt z^\beta = y^{\beta/s} \leq y.
\end{equation*}
From
(13.4.1), we deduce by induction on
\(m\) that
\begin{equation*}
p_1 \cdots p_m \lt y^{1 - (1 - \beta^{-1})^m} \qquad (m=1,\dots,n-1).
\end{equation*}
In particular,
\begin{equation*}
p_n \geq (y/(p_1\cdots p_{n-1}))^{1/(\beta+1)}
\geq y^{\frac{1}{\beta+1} (1 - \beta^{-1})^{n-1}}
\geq y^{\frac{1}{\beta} (1 - \beta^{-1})^{n}}
\geq z_n
\end{equation*}
if we put
\begin{equation*}
z_n = z^{(1-\beta^{-1})^n}.
\end{equation*}
We will now retain only the conditions \(z > p_1 >\cdots > p_n \geq z_n\) on the primes, which will make the sum bigger because every summand is nonnegative. That is,
\begin{align*}
V_n(z) \amp\leq \sum_{z>p_1>\cdots>p_n \geq z_n} g(p_1\cdots p_n) V(p_n)\\
\amp\leq \frac{1}{n!} V(z_n) \left( \sum_{z_n \leq p \lt z} g(p) \right)^n\\
\amp\leq \frac{1}{n!} V(z_n) \left( \log \frac{V(z_n)}{V(z)} \right)^n.
\end{align*}
Here is where we need the assumption
(13.3.1) about the sieve dimension. It implies
\begin{equation*}
\frac{V(z_n)}{V(z)} \leq K (1 + (\beta-1)^{-1})^{\kappa n} \lt K e^{n/b}
\end{equation*}
for \(\beta = \kappa b + 1\) (using the bound \(1+x \leq e^x\) for \(x = (\beta-1)^{-1} = 1/(\kappa b)\)), which gives us
\begin{align*}
V_n(z) \amp\lt \frac{K}{n!} \left( \frac{n}{b} + \log K \right)^n e^{n/b} V(z)\\
\amp\leq \frac{K}{n!} \left( \frac{n}{b} e^{1/b} \right)^n K^b V(z)
\end{align*}
(using the bound \(1+x \leq e^x\) for \(x = b (\log K)/n\)). Since \(n! \geq e (n/e)^n\) (by taking logs and comparing integrals), we obtain
\begin{equation*}
V_n(z) \lt e^{-1} a^n K^{b+1} V(z)
\end{equation*}
for \(a = b^{-1} e^{1+b^{-1}}\text{.}\)
To conclude, we clean things up a bit. Remember that we were at liberty to choose \(\beta > 1\text{,}\) which is equivalent to choosing \(b > 0\text{.}\) By taking \(b\) sufficiently large, we can force \(a \lt 1\text{;}\) for instance, we could take \(b = 9\) to get \(a \lt e^{-1}\text{.}\) Note also that because
\begin{equation*}
p_1 > \cdots > p_n \geq
y_n = (y/(p_1 \cdots p_n))^{1/\beta},
\end{equation*}
we have \(p_1^{n+\beta} > y\text{.}\) Since we also have \(p_1 \lt z = y^{1/s}\text{,}\) we deduce that \(V_n(z) = 0\) unless \(n + \beta > s\text{.}\) Therefore
\begin{equation*}
\sum_{n>0} V_n(z) = \sum_{n>s-\beta} V_n(z) \lt \frac{a^{s-\beta}}{e(1-a)}
K^{b+1} V(z).
\end{equation*}
To conclude, we have the following bound.
Theorem 13.2. Combinatorial sieve.
In the combinatorial sieve with parameters \(y_1, y_2, \dots\) as above, and \(\beta = 9\kappa + 1\text{,}\) for any multiplicative function \(g(d)\) satisfying (13.1.1) and (13.3.1) for a given \(K\text{,}\) and any \(s \geq \beta\text{,}\) for \(z = y^{1/s}\) we have
\begin{align*}
V^+(z) \amp\lt (1 + e^{\beta -s}K^{10})V(z)\\
V^-(z) \amp> (1 + e^{\beta-s} K^{10})V(z).
\end{align*}
Consequently,
\begin{equation*}
(1-e^{\beta-s}K^{10}) V(z) X - R(x,z^s)
\leq S(x,z) \leq
(1+e^{\beta-s}K^{10}) V(z) X + R(x,z^s).
\end{equation*}
Proof.
As above, or see [13], Theorem 6.1.
Section 13.5 Consequences for twin almost-primes
Again consider the example
\begin{equation*}
f(n) = \begin{cases} 1 \amp \mbox{ prime} \\ 0 \amp \mbox{otherwise}.
\end{cases}
\end{equation*}
Theorem 13.3.
There are infinitely many primes \(p\) such that \(p+2\) is the product of at most twenty distinct primes.Proof.
Exercise (see Exercise 13.6.2).
Using a refinement of the sieving method, Chen was able to prove the following.
Theorem 13.4. (Chen).
There are infinitely many primes \(p\) such that \(p+2\) is the product of at most two distinct primes.Proof.
The original reference is [1], which also includes the proof of a similar weakening of the Goldbach conjecture (every sufficiently large even integer is the sum of a prime and product of at most two distinct primes). For a modern exposition, see [5], Theorem 25.10.
While
Theorem 13.4 is tantalizingly close to the twin prime conjecture, it seems that sieving methods fall short of delivering that particular prize.
One can also use
Theorem 13.2 to deduce that the number of twin primes
\(\leq x\) is
\(O(x /\log^2 x)\text{;}\) however, since this is a question about an upper bound rather than a lower bound, it is much simpler to derive it using the Selberg sieve (
Exercise 15.4.6).