Chapter 10 Error bounds for primes in arithmetic progressions

In this chapter, we summarize how to derive a form of the prime number theorem in arithmetic progressions with an appropriate uniformity in the modulus. Many proofs are missing, and will not be included in this course. We will revisit this uniformity later in the Bombieri–Vinogradov theorem.

Section 10.1 von Mangoldt’s formula for \(L\)-functions

We start by formulating the analogue of von Mangoldt’s formula (Theorem 9.9) for a Dirichlet \(L\)-function.

Definition 10.1.

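For \(\chi\) a Dirichlet character, define
\begin{equation*} \psi(x, \chi) := \sum_{n \leq x} \chi(n) \Lambda(n)\text{,} \end{equation*}
where \(\Lambda(n)\) is the von Mangoldt function (Exercise 2.4.6) and, when \(x\) is an integer, the term \(n=x\) is counted with weight \(1/2\text{.}\)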
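As a concrete illustration of Definition 10.1, here is a minimal Python sketch that evaluates \(\psi(x,\chi)\) by brute force. The choice of \(\chi\) (the nonprincipal character of level 4) and the helper names `von_mangoldt`, `chi4`, and `psi` are assumptions made for this example only; the half-weight on a term with \(n = x\) follows the convention in the definition.

```python
from math import log


def von_mangoldt(n: int) -> float:
    """Return log p if n = p^k for a prime p and k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; strip every copy of it.
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no factor found, so n itself is prime


def chi4(n: int) -> int:
    """The nonprincipal Dirichlet character of level 4 (illustrative choice)."""
    return (0, 1, 0, -1)[n % 4]


def psi(x: float, chi) -> float:
    """Brute-force psi(x, chi); a term with n == x is weighted by 1/2."""
    total = 0.0
    for n in range(1, int(x) + 1):
        weight = 0.5 if n == x else 1.0
        total += weight * chi(n) * von_mangoldt(n)
    return total


# The prime powers up to 10 coprime to 4 are 3, 5, 7, 9, so
# psi(10.5, chi4) = -log 3 + log 5 - log 7 + log 3 = log(5/7).
print(psi(10.5, chi4), log(5 / 7))
```

Trial division keeps the sketch short; for large \(x\) a sieve would be the more natural choice.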

Proof.

For a fixed \(N\text{,}\) one can use this formula together with a zero-free region for all of the \(L(s,\chi)\) with \(\chi\) of level \(N\text{,}\) to obtain a prime number theorem for arithmetic progressions of difference \(N\) with an estimate for the error term. However, one would also like to be able to establish a prime number theorem with error term for arithmetic progressions where the difference is allowed to vary; we study this question next.

Section 10.2 Uniformity in the explicit formula

We now ask the question: to what extent can we use Theorem 10.2 to obtain an estimate for the prime number theorem in arithmetic progressions with some degree of uniformity in the modulus?
To begin with, recall that the proof of Theorem 9.9 included a step where we shifted the value of \(T\) to avoid getting too close to the imaginary part of a zero of \(\zeta\) in the critical strip. A similar step occurs in the proof of Theorem 10.2, so we must make sure that this adjustment can be made uniformly in \(N\text{;}\) this follows from the following lemma.

Proof.

The other remaining issues lie in the shape of the main terms of Theorem 10.2. One of these is the quantity \(b(\chi)\) which had no analogue for \(\zeta\text{.}\) To describe it, we go back to the product expansion:
\begin{equation*} \frac{L'(s,\chi)}{L(s,\chi)} = -\frac{1}{2} \log (N/\pi) - \frac{1}{2} \frac{\Gamma'(s/2 + a/2)}{\Gamma(s/2 + a/2)} + B(\chi) + \sum_{\rho} \left( \frac{1}{s-\rho} + \frac{1}{\rho} \right)\text{,} \end{equation*}
where \(a=0\) if \(\chi\) is even and \(a=1\) if \(\chi\) is odd. The constant \(B\) here is not the same as \(b\) (it includes the contribution from the exponential part of the Hadamard product expansion), but no matter; we can eliminate it by comparing a given \(s\) with \(s=2\text{.}\) Hence
\begin{equation*} \frac{L'(s,\chi)}{L(s,\chi)} = O(1) - \frac{1}{2} \frac{\Gamma'(s/2 + a/2)}{\Gamma(s/2 + a/2)} + \sum_\rho \left( \frac{1}{s-\rho} - \frac{1}{2-\rho} \right)\text{,} \end{equation*}
where the implied constant in \(O(1)\) is absolute (independent of \(N\)). If \(a=1\text{,}\) everything is holomorphic near \(s=0\text{;}\) if \(a=0\text{,}\) the two log derivatives both have a simple pole at \(s=0\text{,}\) and the residues match. We can thus equate the constant terms of the expansions around \(s=0\text{,}\) to obtain
\begin{equation*} b(\chi) = O(1) - \sum_\rho \left( \frac{1}{\rho} + \frac{1}{2-\rho} \right)\text{.} \end{equation*}
As happened with \(\zeta\text{,}\) we can bound the contribution of the zeroes with \(|\Imag(\rho)| \geq 1\) by \(O(\log N)\text{.}\) The same goes for the term \(1/(2-\rho)\) when \(|\Imag(\rho)| \leq 1\text{.}\)
Putting this together, we now have
\begin{equation} \psi(x, \chi) = -\sum_{\rho\colon |\Imag(\rho)| \lt T} \frac{x^\rho}{\rho} + \sum_{\rho\colon |\Imag(\rho)| \lt 1} \frac{1}{\rho} + O(x T^{-1} \log^2 (Nx))\text{.}\tag{10.2.1} \end{equation}

Section 10.3 The generalized Riemann hypothesis

As with \(\zeta\text{,}\) numerical evidence supports a very strong conjecture on the location of the zeroes of \(L(s,\chi)\text{.}\)
Under Conjecture 10.4, (10.2.1) implies the estimate
\begin{equation} \psi(x,\chi) = O(x^{1/2+\epsilon}) \quad \text{for every } \epsilon \gt 0\tag{10.3.1} \end{equation}
and a corresponding estimate for the prime number theorem in arithmetic progressions.
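The square-root size predicted by (10.3.1) can be observed numerically, although a finite computation proves nothing about the conjecture. The sketch below tabulates \(\psi(x,\chi)\) for the nonprincipal character of level 4 (an illustrative choice) and prints \(|\psi(x,\chi)|/x^{1/2}\) at a few values of \(x\text{.}\) The function names are hypothetical, and the half-weight convention at \(n = x\) is ignored for simplicity.

```python
from math import log, sqrt


def mangoldt_table(limit: int) -> list[float]:
    """Lambda(n) for 0 <= n <= limit, filled in by a sieve over prime powers."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for m in range(p * p, limit + 1, p):
            is_composite[m] = True
        q = p
        while q <= limit:  # mark p, p^2, p^3, ...
            lam[q] = log(p)
            q *= p
    return lam


def psi_chi4_values(limit: int) -> list[float]:
    """Running sums: entry n is the sum of chi(m) * Lambda(m) over m <= n."""
    lam = mangoldt_table(limit)
    chi = (0, 1, 0, -1)  # nonprincipal character of level 4
    running = 0.0
    out = [0.0]
    for n in range(1, limit + 1):
        running += chi[n % 4] * lam[n]
        out.append(running)
    return out


values = psi_chi4_values(10**5)
for x in (10**2, 10**3, 10**4, 10**5):
    # Columns: x, psi(x, chi), |psi(x, chi)| / sqrt(x)
    print(x, values[x], abs(values[x]) / sqrt(x))
```

The last column staying of modest size is consistent with (10.3.1); compare the trivial bound \(|\psi(x,\chi)| \leq \psi(x)\text{,}\) which is of size about \(x\text{.}\)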

Remark 10.5.

Since the generalized Riemann hypothesis is both empirically evident and (apparently) theoretically intractable, it has become quite common to prove theorems that are conditional on it. However, when handling such theorems, one must be careful to understand exactly which \(L\)-functions are being included in the “generalized Riemann hypothesis”, as in some cases these go beyond the Dirichlet \(L\)-functions. For instance, one can formulate (and test numerically) a generalized Riemann hypothesis for some of the more exotic \(L\)-functions we will consider at the end of the course, like Artin nonabelian \(L\)-functions (Chapter 22) and elliptic curve \(L\)-functions (Chapter 23).

Section 10.4 Zero-free regions for \(\chi\)

Returning to the unconditional realm (i.e., not assuming Conjecture 10.4), we need to describe a zero-free region for \(L(s,\chi)\) with some uniformity in the level. This proceeds similarly to the case of \(\zeta\) if we again ignore zeroes near the real line.

Proof.

Definition 10.7.

A zero of \(L(s,\chi)\) arising as an exception in Theorem 10.6 (necessarily for \(\chi\) a real character) is called an exceptional zero, or Siegel zero, of \(L(s,\chi)\text{.}\) Note that the criterion for being a Siegel zero is not absolute; it depends on the choice of the cutoff parameter \(c\text{.}\) Note also that Siegel zeroes are “cryptids” in the sense that we expect (based on Conjecture 10.4 plus copious numerical evidence) that they do not exist for any \(\chi\text{;}\) nonetheless, for the purposes of proving unconditional theorems it is helpful to have a way to refer to such hypothetical animals.

Remark 10.8.

One can see where the possibility of an exceptional zero arises by beginning to imitate for \(L(s,\chi)\) the proof we gave of the zero-free region for \(\zeta\) (Theorem 8.8). We have
\begin{equation*} -\frac{L'(s,\chi)}{L(s,\chi)} = \sum_{n=1}^\infty \Lambda(n) n^{-\Real(s)} \chi(n) e^{-i \Imag(s) \log n}\text{,} \end{equation*}
and using the trigonometric inequality \(3 + 4 \cos \theta + \cos 2\theta \geq 0\text{,}\) we have for \(\sigma \gt 1\)
\begin{equation*} -3 \frac{L'(\sigma,\chi_0)}{L(\sigma,\chi_0)} - 4 \Real \frac{L'(\sigma + it, \chi)}{L(\sigma + it,\chi)} - \Real \frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \geq 0\text{.} \end{equation*}
Here \(\chi_0\) is the principal character of the same level as \(\chi\text{.}\)
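The trigonometric inequality in question is presumably the classical \(3 + 4\cos\theta + \cos 2\theta \geq 0\) used for \(\zeta\text{,}\) which follows from the identity \(3 + 4\cos\theta + \cos 2\theta = 2(1+\cos\theta)^2\text{.}\) One applies it with \(\theta\) the argument of \(\chi(n) n^{-it}\) for each \(n\) coprime to \(N\text{,}\) multiplies by \(\Lambda(n) n^{-\sigma}\text{,}\) and sums over \(n\text{.}\) The following sketch, with hypothetical names, checks the identity and the nonnegativity on a grid of sample angles.

```python
from math import cos, pi


def trig_form(theta: float) -> float:
    """The expression 3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * cos(theta) + cos(2 * theta)


# Sample one full period of theta on a uniform grid.
samples = [2 * pi * k / 1000 for k in range(1001)]

# Smallest sampled value; it should be 0 up to rounding, attained at theta = pi.
min_value = min(trig_form(t) for t in samples)

# Largest sampled deviation from the closed form 2 (1 + cos(theta))^2.
max_gap = max(abs(trig_form(t) - 2 * (1 + cos(t)) ** 2) for t in samples)

print(min_value, max_gap)
```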
The argument to exclude zeroes close to the edge of the critical strip proceeds as before if \(\Imag(\rho)\) is bounded away from 0, say \(|\Imag(\rho)| \gt c/(\log N)\text{.}\) For \(\chi\) nonreal, you do better: \(\chi^2\) is nonprincipal and so \(L(\sigma+2it, \chi^2)\) stays bounded as \(\sigma \to 1^+\text{.}\) So you get an inequality of the form
\begin{equation*} \frac{4}{\sigma - \Real(\rho)} \lt \frac{3}{\sigma - 1} + O(\log N + \log (|\Imag(\rho)| + 1))\text{,} \end{equation*}
and that gives you a zero-free region all the way down to the real line.
Unfortunately, if \(\chi\) is real, then \(\chi^2\) is principal, so \(L(s, \chi^2)\) has a pole at \(s = 1\) and the term involving \(L(\sigma+2it, \chi^2)\) blows up as \(\sigma + 2it \to 1\text{.}\) Our present methods then cannot exclude a single zero very close to 1: we only end up with an inequality of the form
\begin{equation*} \frac{4}{\sigma - \Real(\rho)} \lt \frac{3}{\sigma - 1} + \Real \left( \frac{1}{\sigma-1+2i\Imag(\rho)} \right) + O(\log N + \log (|\Imag(\rho)| + 1))\text{.} \end{equation*}
However, we can exclude two such zeroes \(\rho_1, \rho_2\text{,}\) by writing
\begin{equation*} -\frac{L'(\sigma,\chi)}{L(\sigma,\chi)} \lt - \frac{1}{\sigma - \rho_1} - \frac{1}{\sigma - \rho_2} + O(\log N + \log (|\Imag(\rho)| + 1)) \end{equation*}
and so on.

Section 10.5 Controlling the exceptional zeroes

Returning to (10.2.1), suppose for the sake of argument that \(\chi\) is real and \(L(s,\chi)\) does admit a Siegel zero \(\beta\text{.}\) By the functional equation, \(1-\beta\) is also a zero of \(L(s,\chi)\text{.}\) The sum of \(1/\rho\) over the remaining zeroes in the range \(|\Imag(\rho)| \lt 1\) is \(O((\log N)^2)\text{,}\) since there are \(O(\log N)\) such zeroes and each term contributes \(O(\log N)\) to the sum. We then have
\begin{equation*} \psi(x,\chi) = -\sum_{\rho\colon |\Imag(\rho)| \lt T}^{\sim} \frac{x^\rho}{\rho} - \frac{x^{\beta}}{\beta} - \frac{x^{1-\beta}-1}{1-\beta} + O(x T^{-1} \log^2 (Nx))\text{,} \end{equation*}
where the tilde indicates that we are not counting \(\beta\) and \(1-\beta\) among the zeroes.
The term \((x^{1-\beta}-1)/(1-\beta)\) is \(O(x^c \log x)\text{,}\) but controlling the term \(x^\beta/\beta\) requires preventing the exceptional zero from getting too close to \(1\text{.}\) Here’s one way to do that.

Proof.

The proof of this uses Theorem 10.6; the idea is to show that if you have an exceptional zero for one real character, it “repels” real zeroes for other characters. See [3], section 21.
This is enough to get the following form of the prime number theorem in arithmetic progressions with error term, called the Siegel–Walfisz theorem.

Definition 10.10.

For \(N\) a positive integer and \(a\) an integer coprime to \(N\text{,}\) define
\begin{align*} \pi(x, N, a) \amp:= \sum_{p \leq x, p \equiv a (N)} 1\\ \psi(x, N, a) \amp:= \sum_{n \leq x, n \equiv a (N)} \Lambda(n)\text{.} \end{align*}
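To make Definition 10.10 concrete, the following sketch computes \(\pi(x, N, a)\) and \(\psi(x, N, a)\) by brute force and compares \(\psi(x, N, a)\) with \(x/\phi(N)\text{,}\) the main term one expects from the prime number theorem in arithmetic progressions. All function names are hypothetical, and trial division is used only to keep the example short.

```python
from math import gcd, log


def smallest_prime_factor(n: int) -> int:
    """Smallest prime dividing n, for n >= 2 (found by trial division)."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


def von_mangoldt(n: int) -> float:
    """Return log p if n = p^k for a prime p and k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = smallest_prime_factor(n)
    while n % p == 0:
        n //= p
    return log(p) if n == 1 else 0.0


def pi_ap(x: float, N: int, a: int) -> int:
    """Number of primes p <= x with p congruent to a modulo N."""
    return sum(
        1
        for p in range(2, int(x) + 1)
        if p % N == a % N and smallest_prime_factor(p) == p
    )


def psi_ap(x: float, N: int, a: int) -> float:
    """Sum of Lambda(n) over n <= x with n congruent to a modulo N."""
    return sum(von_mangoldt(n) for n in range(1, int(x) + 1) if n % N == a % N)


def euler_phi(N: int) -> int:
    """Euler's totient function, by direct count."""
    return sum(1 for k in range(1, N + 1) if gcd(k, N) == 1)


x, N = 10**4, 7
for a in range(1, N):
    if gcd(a, N) == 1:
        # Columns: residue a, pi(x, N, a), psi(x, N, a), expected main term
        print(a, pi_ap(x, N, a), psi_ap(x, N, a), x / euler_phi(N))
```

For a fixed \(N\) the values of \(\psi(x, N, a)\) should cluster around \(x/\phi(N)\) across the residue classes; the question pursued in this chapter is how large \(N\) may be relative to \(x\) before the error term overwhelms this main term.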
The statement of Theorem 10.11 only has content for \(N\) no bigger than a fixed power of \(\log x\text{.}\) Even without Conjecture 10.4, we can prove much better results, say for \(N\) up to \(x^c\) for a fixed \(c \lt 1/2\text{,}\) when we aggregate the error bounds over a range of values of \(N\text{.}\) More on this when we discuss the Bombieri–Vinogradov theorem (Theorem 17.1).