Chapter 11 Error bounds for primes in arithmetic progressions

In this chapter, we summarize how to derive a form of the prime number theorem in arithmetic progressions with an appropriate uniformity in the modulus. Many proofs are missing, and will not be included in this course. We will revisit this uniformity later in the Bombieri–Vinogradov theorem.

Section 11.1 von Mangoldt's formula for \(L\)-functions

We start by formulating the analogue of von Mangoldt's formula (Theorem 10.9) for a Dirichlet \(L\)-function.

Definition 11.1.

For \(\chi\) a Dirichlet character, define
\begin{equation*} \psi(x, \chi) = \sum_{n \leq x} \chi(n) \Lambda(n), \end{equation*}
where \(\Lambda(n)\) is the von Mangoldt function (Exercise 3.4.6) and again we multiply the \(n=x\) term by \(1/2\) if it is present.
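To make Definition 11.1 concrete, here is a minimal Python sketch that evaluates \(\psi(x,\chi)\) directly from the definition, including the factor \(1/2\) on the \(n=x\) term. The function names and the choice of example character (the nonprincipal character of level 4) are ours, made only for illustration; the trial-division computation of \(\Lambda(n)\) is meant for small \(x\) only.

```python
import math


def von_mangoldt(n: int) -> float:
    """Return Lambda(n): log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    # Smallest prime factor of n by trial division (n itself if n is prime).
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    # n is a power of p exactly when dividing out p repeatedly leaves 1.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def chi4(n: int) -> int:
    """Illustrative choice: the nonprincipal Dirichlet character of level 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def psi_chi(x: float, chi) -> float:
    """psi(x, chi), with the n = x term (if present) multiplied by 1/2."""
    total = 0.0
    for n in range(1, math.floor(x) + 1):
        weight = 0.5 if n == x else 1.0
        total += weight * chi(n) * von_mangoldt(n)
    return total


if __name__ == "__main__":
    for x in (5, 5.5, 10, 100):
        print(x, psi_chi(x, chi4))
```

For instance, the prime powers up to 10 that are coprime to 4 are 3, 5, 7, 9, so `psi_chi(10, chi4)` adds up \(-\log 3 + \log 5 - \log 7 + \log 3\).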
See [3], section 19.
For a fixed \(N\text{,}\) one can use this formula together with a zero-free region for all of the \(L(s,\chi)\) with \(\chi\) of level \(N\text{,}\) to obtain a prime number theorem for arithmetic progressions of difference \(N\) with an estimate for the error term. However, one would also like to be able to establish a prime number theorem with error term for arithmetic progressions where the difference is allowed to vary; we study this question next.

Section 11.2 Uniformity in the explicit formula

We now ask the question: to what extent can we use Theorem 11.2 to obtain an estimate for the prime number theorem in arithmetic progressions with some degree of uniformity in the modulus?
To begin with, recall that the proof of Theorem 10.9 included a step where we shifted the value of \(T\) to avoid getting too close to the imaginary part of a zero of \(\zeta\) in the critical strip. A similar step occurs in the proof of Theorem 11.2, so we must make sure that this adjustment can be made uniformly in \(N\text{;}\) this follows from the following lemma.
See [3], section 16.
The other remaining issues lie in the shape of the main terms of Theorem 11.2. One of these is the quantity \(b(\chi)\) which had no analogue for \(\zeta\text{.}\) To describe it, we go back to the product expansion:
\begin{equation*} \frac{L'(s,\chi)}{L(s,\chi)} = -\frac{1}{2} \log (N/\pi) - \frac{1}{2} \frac{\Gamma'(s/2 + a/2)}{\Gamma(s/2 + a/2)} + B(\chi) + \sum_{\rho} \left( \frac{1}{s-\rho} + \frac{1}{\rho} \right), \end{equation*}
where \(a=0\) if \(\chi\) is even and \(a=1\) if \(\chi\) is odd. The constant \(B\) here is not the same as \(b\) (it includes the contribution from the exponential part of the Hadamard product expansion), but no matter; we can eliminate it by comparing a given \(s\) with \(s=2\text{.}\) Hence
\begin{equation*} \frac{L'(s,\chi)}{L(s,\chi)} = O(1) - \frac{1}{2} \frac{\Gamma'(s/2 + a/2)}{\Gamma(s/2 + a/2)} + \sum_\rho \left( \frac{1}{s-\rho} - \frac{1}{2-\rho} \right), \end{equation*}
where the implied constant in \(O(1)\) is absolute (independent of \(N\)). If \(a=1\text{,}\) everything is holomorphic near \(s=0\text{;}\) if \(a=0\text{,}\) the two log derivatives both have a simple pole at \(s=0\text{,}\) and the residues match. We can thus equate the constant terms of the expansions around \(s=0\text{,}\) to obtain
\begin{equation*} b(\chi) = O(1) - \sum_\rho \left( \frac{1}{\rho} + \frac{1}{2-\rho} \right). \end{equation*}
As happened with \(\zeta\text{,}\) we can bound the contribution of the zeroes with \(|\Imag(\rho)| \geq 1\) by \(O(\log N)\text{.}\) The same goes for the term \(1/(2-\rho)\) when \(|\Imag(\rho)| \leq 1\text{.}\)
Putting this together, we now have
\begin{equation} \psi(x, \chi) = -\sum_{\rho: |\Imag(\rho)| \lt T} \frac{x^\rho}{\rho} + \sum_{\rho: |\Imag(\rho)| \lt 1} \frac{1}{\rho} + O(x T^{-1} \log^2 (Nx)).\tag{11.2.1} \end{equation}

Section 11.3 The generalized Riemann hypothesis

As with \(\zeta\text{,}\) numerical evidence supports a very strong conjecture on the location of the zeroes of \(L(s,\chi)\text{.}\)
Under Conjecture 11.4, (11.2.1) implies the estimate
\begin{equation} \psi(x,\chi) = O(x^{1/2+\epsilon}) \qquad (\epsilon > 0)\tag{11.3.1} \end{equation}
and a corresponding estimate for the prime number theorem in arithmetic progressions.
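As a rough numerical illustration of the square-root scale in (11.3.1), the following Python sketch tabulates \(|\psi(x,\chi)|/\sqrt{x}\) for the nonprincipal character of level 4. It is an experiment under our own choices (the character, the checkpoints, and all function names are assumptions made for illustration), and a finite table of this kind is of course not evidence of the kind that a proof would require.

```python
import math


def mangoldt_sieve(limit: int) -> list[float]:
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        # p is prime: mark its multiples, then record log p at each power of p.
        for m in range(2 * p, limit + 1, p):
            is_composite[m] = True
        q = p
        while q <= limit:
            lam[q] = math.log(p)
            q *= p
    return lam


def psi_chi4(checkpoints: list[int]) -> dict[int, float]:
    """psi(x, chi) at each checkpoint x, for chi nonprincipal of level 4.

    The checkpoints are assumed to be even, so chi(x) = 0 and the
    half-weight convention on the n = x term plays no role.
    """
    chi4 = (0, 1, 0, -1)  # values of chi on the residues 0, 1, 2, 3 mod 4
    limit = max(checkpoints)
    lam = mangoldt_sieve(limit)
    wanted = set(checkpoints)
    values, running = {}, 0.0
    for n in range(1, limit + 1):
        running += chi4[n % 4] * lam[n]
        if n in wanted:
            values[n] = running
    return values


if __name__ == "__main__":
    for x, value in psi_chi4([10**2, 10**3, 10**4, 10**5]).items():
        ratio = abs(value) / math.sqrt(x)
        print(f"x = {x:>6}  psi = {value:9.3f}  |psi|/sqrt(x) = {ratio:.3f}")
```

The sieve is used here only so that the table can be extended to larger \(x\) in reasonable time.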

Remark 11.5.

Since the generalized Riemann hypothesis is both empirically evident and (apparently) theoretically intractable, it has become quite common to prove theorems that are conditional on it. However, when handling such theorems, one must be careful to understand exactly which \(L\)-functions are being included in the “generalized Riemann hypothesis”, as in some cases these go beyond the Dirichlet \(L\)-functions. For instance, one can formulate (and test numerically) a generalized Riemann hypothesis for some of the more exotic \(L\)-functions we will consider at the end of the course, like Artin nonabelian \(L\)-functions (Chapter 23) and elliptic curve \(L\)-functions (Chapter 24).

Section 11.4 Zero-free regions for \(\chi\)

Returning to the unconditional realm (i.e., not assuming Conjecture 11.4), we need to describe a zero-free region for \(L(s,\chi)\) with some uniformity. This proceeds similarly to the case of \(\zeta\) if we again ignore zeroes near the real line.
We give a sketch in Remark 11.8. For a full argument, see [3], section 14.

Definition 11.7.

A zero of \(L(s,\chi)\) arising as an exception in Theorem 11.6 (necessarily for \(\chi\) a real character) is called an exceptional zero, or Siegel zero, of \(L(s,\chi)\text{.}\) Note that the criterion for being a Siegel zero is not absolute; it depends on the choice of the cutoff parameter \(c\text{.}\) Note also that Siegel zeroes are “cryptids” in the sense that we expect (based on Conjecture 11.4 plus copious numerical evidence) that they do not exist for any \(\chi\text{;}\) nonetheless, for the purposes of proving unconditional theorems it is helpful to have a way to refer to such hypothetical animals.

Remark 11.8.

One can see where the possibility of an exceptional zero arises by beginning to imitate for \(L(s,\chi)\) the proof we gave of the zero-free region for \(\zeta\) (Theorem 9.8). We have
\begin{equation*} -\frac{L'(s,\chi)}{L(s,\chi)} = \sum_{n=1}^\infty \Lambda(n) n^{-\Real(s)} \chi(n) e^{-i \Imag(s) \log n}, \end{equation*}
and using the trigonometric inequality, we have for \(\sigma > 1\)
\begin{equation*} -3 \frac{L'(\sigma,\chi_0)}{L(\sigma,\chi_0)} - 4 \Real \frac{L'(\sigma + it, \chi)}{L(\sigma + it,\chi)} - \Real \frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \geq 0. \end{equation*}
Here \(\chi_0\) is the principal character of the same level as \(\chi\text{.}\)
The argument to exclude zeroes close to the edge of the critical strip proceeds as before if \(\Imag(\rho)\) is bounded away from 0, say \(|\Imag(\rho)| > c/(\log N)\text{.}\) For \(\chi\) nonreal, you do better: \(\chi^2\) is nonprincipal and so \(L(\sigma+2it, \chi^2)\) stays bounded as \(\sigma \to 1^+\text{.}\) So you get an inequality of the form
\begin{equation*} \frac{4}{\sigma - \Real(\rho)} \lt \frac{3}{\sigma - 1} + O(\log N + \log (|\Imag(\rho)| + 1)), \end{equation*}
and that gives you a zero-free region all the way down to the real line.
Unfortunately, if \(\chi\) is real, then \(\chi^2\) is principal, so \(L(\sigma+2it, \chi^2)\) blows up as \(\sigma + 2it \to 1\text{,}\) and our present methods cannot exclude a single zero very close to 1: we only end up with an inequality of the form
\begin{equation*} \frac{4}{\sigma - \Real(\rho)} \lt \frac{3}{\sigma - 1} + \Real \left( \frac{1}{\sigma-1+2i\Imag(\rho)} \right) + O(\log N + \log (|\Imag(\rho)| + 1)). \end{equation*}
However, we can exclude two such zeroes \(\rho_1, \rho_2\text{,}\) by writing
\begin{equation*} -\frac{L'(\sigma,\chi)}{L(\sigma,\chi)} \lt - \frac{1}{\sigma - \rho_1} - \frac{1}{\sigma - \rho_2} + O(\log N) \end{equation*}
and so on.
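The trigonometric inequality invoked in Remark 11.8 is \(3 + 4\cos\theta + \cos 2\theta = 2(1+\cos\theta)^2 \geq 0\text{,}\) applied term by term to the Dirichlet series with \(e^{i\theta} = \chi(n) n^{-it}\text{.}\) The following Python sketch checks this numerically for one nonreal character; the character of level 5 with \(\chi(2) = i\text{,}\) the sampling grid, and the function names are our own illustrative choices.

```python
import cmath
import math


def trig_form(theta: float) -> float:
    """The nonnegative combination 3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


def chi5(n: int) -> complex:
    """Illustrative nonreal character of level 5, determined by chi(2) = i."""
    return (0, 1, 1j, -1j, -1)[n % 5]


def three_four_one_term(n: int, t: float) -> float:
    """Real part of 3 chi_0(n) + 4 chi(n) n^{-it} + chi(n)^2 n^{-2it}."""
    if n % 5 == 0:
        return 0.0  # every character of level 5 vanishes at such n
    # z has absolute value 1, so z = e^{i theta} for some real theta.
    z = chi5(n) * cmath.exp(-1j * t * math.log(n))
    return (3 + 4 * z + z * z).real


def smallest_term(max_n: int, t_values: list[float]) -> float:
    """Minimum of the termwise expression over a finite grid of (n, t)."""
    return min(three_four_one_term(n, t)
               for n in range(1, max_n + 1) for t in t_values)


if __name__ == "__main__":
    grid = [k / 10 for k in range(-100, 101)]
    print("smallest sampled term:", smallest_term(200, grid))
```

Since each coefficient \(\Lambda(n) n^{-\sigma}\) is nonnegative, nonnegativity of every such term is what gives the displayed inequality for \(\sigma > 1\text{.}\)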

Section 11.5 Controlling the exceptional zeroes

Returning to (11.2.1), suppose for the sake of argument that \(\chi\) is real and \(L(s,\chi)\) does admit a Siegel zero \(\beta\text{.}\) By the functional equation, \(1-\beta\) is also a zero of \(L(s,\chi)\text{.}\) The sum of \(1/\rho\) over the remaining zeroes in the range \(|\Imag(\rho)| \lt 1\) is \(O((\log N)^2)\text{,}\) since there are \(O(\log N)\) such zeroes and each term contributes \(O(\log N)\) to the sum. We then have
\begin{equation*} \psi(x,\chi) = -\sum_{\rho: |\Imag(\rho)| \lt T}^{\sim} \frac{x^\rho}{\rho} - \frac{x^{\beta}}{\beta} - \frac{x^{1-\beta}-1}{1-\beta} + O(x T^{-1} \log^2 (Nx)), \end{equation*}
where the tilde indicates that we are not counting \(\beta\) and \(1-\beta\) among the zeroes.
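The term \((x^{1-\beta}-1)/(1-\beta)\) equals \(x^{\sigma} \log x\) for some \(0 \lt \sigma \lt 1-\beta\) by the mean value theorem, so it is \(O(x^{1/4} \log x)\) as soon as \(\beta > 3/4\text{;}\) but controlling the term \(x^\beta/\beta\) requires preventing the exceptional zero from getting too close to \(1\text{.}\) Here's one way to do that.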
The proof of this uses Theorem 11.6; the idea is to show that if you have an exceptional zero for one real character, it “repels” real zeroes for other characters. See [3], section 21.
This is enough to get the following form of the prime number theorem in arithmetic progressions with error term, called the Siegel–Walfisz theorem.

Definition 11.10.

For \(N\) a positive integer and \(a\) an integer coprime to \(N\text{,}\) put
\begin{align*} \pi(x, N, a) \amp:= \sum_{p \leq x, p \equiv a (N)} 1\\ \psi(x, N, a) \amp:= \sum_{n \leq x, n \equiv a (N)} \Lambda(n). \end{align*}
The statement of Theorem 11.11 only has content for \(N\) no bigger than a fixed power of \(\log x\text{.}\) Even without Conjecture 11.4, we can prove much better results, say for \(N\) up to \(x^c\) for a fixed \(c \lt 1/2\text{,}\) when we aggregate the error bounds over a range of values of \(N\text{.}\) More on this when we discuss the Bombieri–Vinogradov theorem (Theorem 18.1).
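The counting functions of Definition 11.10 are easy to tabulate for small \(x\text{.}\) The Python sketch below computes \(\pi(x,N,a)\) and \(\psi(x,N,a)\) by brute force and prints \(\psi(x,N,a)\) next to \(x/\phi(N)\) for comparison; the function names and the sample values of \(x\) and \(N\) are our own illustrative choices, and trial division limits this to modest ranges.

```python
import math


def prime_power_base(n: int) -> int:
    """Return p if n = p^k for a prime p and some k >= 1; otherwise 0."""
    if n < 2:
        return 0
    # Smallest prime factor of n by trial division (n itself if n is prime).
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    while n % p == 0:
        n //= p
    return p if n == 1 else 0


def euler_phi(N: int) -> int:
    """Number of residues modulo N that are coprime to N."""
    return sum(1 for a in range(1, N + 1) if math.gcd(a, N) == 1)


def pi_progression(x: int, N: int, a: int) -> int:
    """pi(x, N, a): the number of primes p <= x with p = a mod N."""
    return sum(1 for n in range(2, x + 1)
               if (n - a) % N == 0 and prime_power_base(n) == n)


def psi_progression(x: int, N: int, a: int) -> float:
    """psi(x, N, a): the sum of Lambda(n) over n <= x with n = a mod N."""
    return sum(math.log(prime_power_base(n)) for n in range(2, x + 1)
               if (n - a) % N == 0 and prime_power_base(n))


if __name__ == "__main__":
    x, N = 10_000, 4
    for a in range(1, N + 1):
        if math.gcd(a, N) == 1:
            print(a, pi_progression(x, N, a),
                  round(psi_progression(x, N, a), 3), x / euler_phi(N))
```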