Skip to main content

Chapter 5 Primes in arithmetic progressions

In this chapter, we first prove Dirichlet's theorem on primes in arithmetic progressions. We then prove the prime number theorem in arithmetic progressions, modulo some details left as exercises.

Section 5.1 Dirichlet's theorem

Definition 5.1.

For short, let us call an arithmetic progression eligible if it has the form \(m, m + N, m + 2N, \dots\) where \(\gcd(m,N) = 1\text{.}\) It is equivalent to ask that some (and hence any) two consecutive terms are relatively prime.
While there are a few special cases that can be proved directly (e.g., see Exercise 5.5.1), the general statement seems to be imperivous to purely algebraic methods. Dirichlet's idea is to instead adapt Euler's proof of the infinitude of primes using the Riemann zeta function (Section 2.1), but adapting the construction to pick out the primes in a particular arithmetic progression. See Theorem 5.11 for the complete argument.

Section 5.2 Asymptotic density and Dirichlet density

As noted above, to prove Theorem 5.2 we will need a tool that will let us measure how many of the primes belong to a fixed arithmetic progression. For this, we need some sort of measure theory on the set of primes, or more generally on infinite sets of integers. Note that Lebesgue-type measure theory is not an option for countable sets: we can only hope to make finitely additive measures.

Definition 5.3.

For \(S \subseteq T\) two sets of positive integers, with \(T\) infinite, the upper natural density and lower natural density of \(S\) in \(T\) are defined as
\begin{equation*} \limsup_{N \to \infty} \frac{\#\{n \in S: n \leq N \}}{\#\{n \in T: n \leq N\}}, \qquad \liminf_{N \to \infty} \frac{\#\{n \in S: n \leq N \}}{\#\{n \in T: n \leq N\}}. \end{equation*}
Of course the upper density is never less than the lower density. If they coincide, we call the common value the natural density (or asymptotic density) of \(S\) in \(T\text{.}\)
Many interesting sets fail to have a natural density (e.g., see Exercise 5.5.3). We get a less restrictive notion of density by using Dirichlet series.

Definition 5.4.

For \(S \subseteq T\) two sets of positive integers, with \(\sum_{n \in T} n^{-1}\) divergent, the upper Dirichlet density and lower Dirichlet density of \(S\) in \(T\) are defined as
\begin{equation*} \limsup_{s \to 1^+} \frac{\sum_{n \in S} n^{-s}}{\sum_{n \in T} n^{-s}}, \qquad \liminf_{s \to 1^+} \frac{\sum_{n \in S} n^{-s}}{\sum_{n \in T} n^{-s}}. \end{equation*}
If they coincide, we call the common value the Dirichlet density of \(S\) in \(T\text{.}\)
Let us make this explicit in two cases of interest.

Example 5.5.

Recall that \(\zeta(s)\) has a simple pole of residue 1 at \(s=1\text{,}\) so as \(s \to 1^+\text{,}\)
\begin{equation*} (s-1) \sum_{n=1}^\infty n^{-s} = 1 + o(1). \end{equation*}
Hence if \(T = \NN\) then the Dirichlet density of \(S\) is given by
\begin{equation*} \lim_{s \to 1^+} (s-1) \sum_{n \in S} n^{-s} \end{equation*}
if the limit exists.

Example 5.6.

Taking logarithms in Example 5.5, we see that as \(s \to 1^+\text{,}\)
\begin{equation*} \log \zeta(s) + \log (s-1) = \log(s-1) + \sum_p \sum_{n=1}^\infty p^{-ns} = O(1). \end{equation*}
Moreover, \(\sum_p \sum_{n=2}^\infty p^{-ns} = O(1)\text{,}\) so
\begin{equation*} \sum_p p^{-s} = -\log (s-1) + O(1). \end{equation*}
Hence if \(T\) is the set of primes, then the Dirichlet density of \(S\) is given by
\begin{equation*} \lim_{s \to 1^+} \frac{\sum_{p \in S} p^{-s}}{-\log(s-1)} \end{equation*}
if the limit exists.
One might worry that natural and Dirichlet densities could both exist and give conflicting numerical values for a single set, but fortunately this cannot happen.
We now formulate some of the measure-theoretic properties of density, leaving details to the reader.
The following example will not be used later but provides a “natural” example of natural densities in elementary number theory.

Example 5.9.

Let \(\alpha,\beta\) be positive irrational numbers with \(1/\alpha + 1/\beta = 1\text{.}\) Put
\begin{align*} S_{\alpha} \amp = \{\lfloor n \alpha \rfloor: n \in \NN\}\\ S_{\beta} \amp = \{\lfloor n \beta \rfloor: n \in \NN\}. \end{align*}
Then \(S_{\alpha}, S_{\beta}\) have natural densities \(1/\alpha, 1/\beta\text{.}\) The fact that these add up to 1 is explained by a beautiful result known as Beatty's theorem (see Exercise 5.5.5): under these conditions, \(S_{\alpha}, S_{\beta}\) are disjoint and their union is \(\NN\text{.}\)

Section 5.3 \(L\)-functions and discrete Fourier analysis

We return to the task of proving Theorem 5.2 by adapting the proof of the prime number thorem but using Dirichlet \(L\)-functions in place of the Riemann zeta function. This will require a bit of extra work to extract the desired information about primes in a particular arithmetic progression; to begin with, let us fix a modulus \(N\) and concentrate on sorting the primes into residue classes modulo \(N\text{.}\)
For \(\chi\) a Dirichlet character of level \(N\text{,}\) we can write
\begin{equation} \log L(s,\chi) = \sum_p \sum_{n=1}^\infty \chi(p^n) p^{-ns}\tag{5.3.1} \end{equation}
for \(\Real(s) > 1\text{;}\) as \(s \to 1^+\text{,}\) we have
\begin{equation} \log L(s, \chi) = \sum_p \chi(p) p^{-s} + O(1).\tag{5.3.2} \end{equation}
For \(\chi\) nonprincipal, we know that \(L(s,\chi)\) is holomorphic and nonvanishing at \(s=1\text{,}\) so
\begin{equation*} \sum_p \chi(p) p^{-s} = O(1), \end{equation*}
whereas for \(\chi_0\) the principal conductor of level \(N\text{,}\) we saw above that
\begin{equation*} \sum_p \chi_0(p) p^{-s} = -\log(s-1) + O(1). \end{equation*}
Our next step is to form a certain linear combination of the \(\log L(s, \chi)\) to isolate \(\sum_{p \equiv m\,(N)} p^{-s}\text{,}\) and compare the asymptotic contributions of \(-\log(s-1)\text{.}\) The fact that we can do this amounts to what is sometimes called discrete Fourier analysis; if you prefer, it is the representation theory of the finite abelian group \((\ZZ/N\ZZ)^\times\text{.}\)
  1. If \(G = G_1 \times G_2\text{,}\) then \(\hat{G} = \hat{G_1} \times \hat{G_2}\text{.}\) Since every finite abelian group \(G\) is a product of cyclic groups, we may reduce to the case where \(G\) is cyclic, and then the result is clear. (For \(G = (\ZZ/N\ZZ)^\times\text{,}\) we can make this more explicit: we can use the Chinese remainder theorem to split \(N\) into distinct prime-power factors, then use the fact that \((\ZZ/p^n\ZZ)^\times\) is cyclic unless \(p=2\text{,}\) in which case it is \(\{\pm 1\}\) times a cyclic group.)
  2. We saw this argument once before, but here it goes again: the left side is invariant under multiplication by \(\chi_1(h) \overline{\chi_2(h)}\) for any \(h \in G\text{,}\) because there is no difference between summing over \(g\) or over \(gh\text{.}\) If \(\chi_1 \neq \chi_2\text{,}\) then we can make that multiplier different from 1 by choosing suitable \(h\text{.}\) So the sum vanishes if \(\chi_1 \neq \chi_2\text{.}\) If \(\chi_1 = \chi_2\text{,}\) each summand is equal to 1 because characters of finite groups takes values which are roots of unity.
At last, we are ready to prove a result that will imply Theorem 5.2.
Summing over all Dirichlet characters \(\chi\) of level \(N\) and applying Theorem 5.10, we obtain
\begin{align*} \frac{1}{\phi(N)} \sum_{\chi} \overline{\chi(m)} \log L(s,\chi) \amp = \frac{1}{\phi(N)} \sum_{\chi} \sum_p \chi(p) \overline{\chi(m)} p^{-s} + O(1)\\ \amp = \sum_{p \equiv m\,(N)} p^{-s} + O(1) \end{align*}
as \(s \to 1^+\text{.}\) On the other hand,
\begin{align*} \frac{1}{\phi(N)} \sum_{\chi} \overline{\chi(m)} \log L(s,\chi) \amp= \frac{1}{\phi(N)} \log L(s,\chi_0) + \frac{1}{\phi(N)} \sum_{\chi \neq \chi_0} \overline{\chi(m)} \log L(s,\chi)\\ \amp= - \frac{1}{\phi(N)} \log(s-1) + O(1). \end{align*}

Section 5.4 The prime number theorem in arithmetic progressions

The proof of Dirichlet's theorem only uses information about the behavior of \(L(s,\chi)\) near \(s=1\text{.}\) Using the fact that \(L(s,\chi) \neq 0\) on the entire line \(\Real(s) = 1\text{,}\) we can prove a much stronger result.
Given what we now know, this is a straightforward adaptation of our proof of the prime number theorem (Chapter 2). We thus skip over some details which are not much changed, in order to focus on the new ingredients.
For \(\chi\) a Dirichlet character of level \(N\text{,}\) define
\begin{equation*} \vartheta_\chi(x) := \sum_{p \leq x} \chi(p) \log p. \end{equation*}
Given a choice of \(m\) coprime to \(N\text{,}\) put
\begin{align*} \vartheta_m(x) \amp = \frac{1}{\phi(N)} \sum_\chi \overline{\chi(m)} \vartheta_\chi(x)\\ \amp = \sum_{p \leq x: p \equiv m\,(N)} \log p \end{align*}
where the last equality follows from Theorem 5.10. As in Lemma 2.7, the desired result is equivalent to \(\vartheta_m(x) \sim \frac{1}{\phi(N)} x\text{.}\)
As in Section 2.3, it will suffice to show convergence of the improper integral
\begin{equation} \int_1^\infty \frac{\phi(N) \vartheta_m(x) - x}{x^2}\,dx.\tag{5.4.1} \end{equation}
We will again use the Tauberian argument (Section 2.4) to reduce this to an application of Theorem 2.8. Keeping in mind (5.3.1), define the function
\begin{equation} \Phi(s) = \sum_\chi - \frac{L'(s,\chi)}{L(s,\chi)} = \sum_\chi \sum_p \sum_{n=1}^\infty \chi(p^n) (n \log p) p^{-ns}\tag{5.4.2} \end{equation}
in which the sum runs over the Dirichlet characters of level \(N\text{.}\) For \(\chi\) principal, \(L(s,\chi)\) differs from \(\zeta(s)\) by finitely many Euler factors, none of which has a zero or pole at \(s=1\text{;}\) hence \(-L'(s,\chi)/L(s,\chi)\) again has a simple pole at \(s=1\) with residue 1 and no other poles in \(\Real(s) \geq 1\text{.}\) Meanwhile, for \(\chi\) nonprincipal, \(L(s,\chi)\) is actually holomorphic for \(\Real(s) \geq 1\) (by Theorem 4.5) with no zeroes in this region (by Theorem 4.8, Theorem 4.10, and Theorem 4.11); hence \(-L'(s,\chi)/L(s,\chi)\) is holomorphic for \(\Real(s) \geq 1\text{.}\)
We deduce from the above that \(\Phi(s+1) - \frac{1}{s} \) extends holomorphically at \(s=0\text{.}\) Now applying Theorem 2.8 as in Section 2.4, we deduce that the value of this function at \(s=0\) computes \(\int_0^\infty (\vartheta_m(e^t)e^{-t} - 1)\,dt\) plus an integral which converges absolutely (corresponding to the summands with \(n>1\) in (5.4.2)). Subsituting \(x = e^t\) yields (5.4.1) as required.
Echoing Remark 2.9, the proof of Theorem 5.12 yields very little information about the error term, i.e., the difference between the actual number of primes \(p \leq x\) with \(p \equiv m \pmod{N}\) and the asympotic count \(\frac{1}{\phi(N)} \frac{x}{\log x}\text{.}\) That becomes a problem if, for instance, we want to know how far we have to go into an arithmetic progression to find a single prime. To address this, we must first get better results on zero-free regions for the functions \(L(s,\chi)\text{,}\) then make a more careful analytic argument to take advantage of this additional information. This will be the focus of the second part of the course.

Exercises 5.5 Exercises

1.

Prove directly (without using Dirichlet's theorem) that there are infinitely many primes congruent to 3 mod 4.
Hint.
Modify Euclid's proof so that the number which is forced to have a “new” prime factor is itself congruent to 3 mod 4.

3.

Let \(S\) be the set of positive integers which have first digit 1 when written in base 10.
  1. Compute the upper and lower natural density of \(S\text{,}\) and verify that \(S\) does not have a natural density.
  2. Prove that \(S\) has a natural Dirichlet density, and compute it. This density appears in statistics as part of Benford's Law.
  3. Optionally, generalize to an arbitrary base \(b \geq 3\) and/or prove the analogous results for the set of primes with first digit 1.

5.

Prove Beatty's theorem (Example 5.9).
Hint.
Compute \(\#(S_\alpha \cap \{1,\dots,n\})\) and \(\#(S_\beta \cap \{1,\dots,n\})\) and show that they add up to \(n\text{.}\)

6.

Prove that there exists a constant \(c\) such that
\begin{equation*} \sum_{p \leq x} \frac{1}{p} = \log \log x + c + o(1). \end{equation*}
Hint.
Use Exercise 2.6.8 and partial summation.

7. (Mertens).

Prove that the constant \(c\) from Exercise 5.5.6 equals
\begin{equation*} c = \gamma + \sum_p \left( \log (1 - p^{-1}) + p^{-1} \right), \end{equation*}
where \(\gamma\) is the Euler–Mascheroni constant (Exercise 2.6.1). Then deduce that
\begin{equation*} \prod_{p \leq x} (1 - p^{-1}) \sim \frac{e^{-\gamma}}{\log x}. \end{equation*}

8.

Deduce point (c) of Theorem 5.10 from points (a) and (b).
Hint.
Form the matrix \(A\) with rows indexed by \(g \in G\text{,}\) columns indexed by \(\chi \in \hat{G}\text{,}\) and entries \(\chi(g)\text{.}\) Then compare \(A A^*\) with \(A^* A\text{,}\) where \(*\) denotes conjugate transpose. Or if you prefer, prove that the dual of \(\hat{G}\) is canonically isomorphic to \(G\text{.}\)