In this chapter, we first prove Dirichlet's theorem on primes in arithmetic progressions. We then prove the prime number theorem in arithmetic progressions, modulo some details left as exercises.
Section5.1Dirichlet's theorem
Definition5.1.
For short, let us call an arithmetic progression eligible if it has the form \(m, m + N, m + 2N, \dots\) where \(\gcd(m,N) = 1\text{.}\) It is equivalent to ask that some (and hence any) two consecutive terms are relatively prime.
Theorem5.2.Dirichlet.
Any eligible arithmetic progression of positive integers contains infinitely many primes.
While there are a few special cases that can be proved directly (e.g., see Exercise 5.5.1), the general statement seems to be imperivous to purely algebraic methods. Dirichlet's idea is to instead adapt Euler's proof of the infinitude of primes using the Riemann zeta function (Section 2.1), but adapting the construction to pick out the primes in a particular arithmetic progression. See Theorem 5.11 for the complete argument.
Section5.2Asymptotic density and Dirichlet density
As noted above, to prove Theorem 5.2 we will need a tool that will let us measure how many of the primes belong to a fixed arithmetic progression. For this, we need some sort of measure theory on the set of primes, or more generally on infinite sets of integers. Note that Lebesgue-type measure theory is not an option for countable sets: we can only hope to make finitely additive measures.
Definition5.3.
For \(S \subseteq T\) two sets of positive integers, with \(T\) infinite, the upper natural density and lower natural density of \(S\) in \(T\) are defined as
\begin{equation*}
\limsup_{N \to \infty} \frac{\#\{n \in S: n \leq N \}}{\#\{n \in T: n \leq N\}},
\qquad
\liminf_{N \to \infty} \frac{\#\{n \in S: n \leq N \}}{\#\{n \in T: n \leq N\}}.
\end{equation*}
Of course the upper density is never less than the lower density. If they coincide, we call the common value the natural density (or asymptotic density) of \(S\) in \(T\text{.}\)
Many interesting sets fail to have a natural density (e.g., see Exercise 5.5.3). We get a less restrictive notion of density by using Dirichlet series.
Definition5.4.
For \(S \subseteq T\) two sets of positive integers, with \(\sum_{n \in T} n^{-1}\) divergent, the upper Dirichlet density and lower Dirichlet density of \(S\) in \(T\) are defined as
One might worry that natural and Dirichlet densities could both exist and give conflicting numerical values for a single set, but fortunately this cannot happen.
Lemma5.7.
Let \(S \subseteq T\) be subsets of \(\NN\) such that \(S\) has natural density \(\delta\) in \(T\text{.}\) Then \(S\) also has Dirichlet density \(\delta\) in \(T\text{.}\) (The converse fails: Exercise 5.5.3 gives an example where \(S\) has a Dirichlet density but no natural density.)
We now formulate some of the measure-theoretic properties of density, leaving details to the reader.
Lemma5.8.
The following statements hold with “density” interpreted (consistently) either as natural density or Dirichlet density.
Any finite set has density 0 in any infinite set.
Density is a finitely additive measure: if \(S_1, \dots, S_m\) are disjoint subsets of \(T\) with densities \(\delta_1, \dots, \delta_m\) in \(T\text{,}\) then their union has density \(\delta_1 + \cdots + \delta_m\) in \(T\text{.}\) Corollary: two subsets of \(T\) whose combined density exceeds 1 must have infinite intersection.
If \(S\) has density \(\delta\) in \(\NN\text{,}\) then for any positive integer \(n\text{,}\)\(nS = \{ns: s \in S\}\) has density \(\delta/n\text{.}\)
The following example will not be used later but provides a “natural” example of natural densities in elementary number theory.
Example5.9.
Let \(\alpha,\beta\) be positive irrational numbers with \(1/\alpha + 1/\beta = 1\text{.}\) Put
\begin{align*}
S_{\alpha} \amp = \{\lfloor n \alpha \rfloor: n \in \NN\}\\
S_{\beta} \amp = \{\lfloor n \beta \rfloor: n \in \NN\}.
\end{align*}
Then \(S_{\alpha}, S_{\beta}\) have natural densities \(1/\alpha, 1/\beta\text{.}\) The fact that these add up to 1 is explained by a beautiful result known as Beatty's theorem (see Exercise 5.5.5): under these conditions, \(S_{\alpha}, S_{\beta}\) are disjoint and their union is \(\NN\text{.}\)
Section5.3\(L\)-functions and discrete Fourier analysis
We return to the task of proving Theorem 5.2 by adapting the proof of the prime number thorem but using Dirichlet \(L\)-functions in place of the Riemann zeta function. This will require a bit of extra work to extract the desired information about primes in a particular arithmetic progression; to begin with, let us fix a modulus \(N\) and concentrate on sorting the primes into residue classes modulo \(N\text{.}\)
For \(\chi\) a Dirichlet character of level \(N\text{,}\) we can write
Our next step is to form a certain linear combination of the \(\log L(s, \chi)\) to isolate \(\sum_{p \equiv m\,(N)} p^{-s}\text{,}\) and compare the asymptotic contributions of \(-\log(s-1)\text{.}\) The fact that we can do this amounts to what is sometimes called discrete Fourier analysis; if you prefer, it is the representation theory of the finite abelian group \((\ZZ/N\ZZ)^\times\text{.}\)
Theorem5.10.Discrete abelian Fourier analysis.
Let \(G\) be a finite abelian group, and let \(\hat{G}\) be the character group (or dual group) of \(G\text{,}\) i.e., the group of homomorphisms \(G \to \CC^*\text{.}\)
The order of \(\hat{G}\) is equal to the order of \(G\text{.}\)
(Orthogonality of characters) If \(\chi_1, \chi_2 \in \hat{G}\text{,}\) then
If \(G = G_1 \times G_2\text{,}\) then \(\hat{G} = \hat{G_1} \times \hat{G_2}\text{.}\) Since every finite abelian group \(G\) is a product of cyclic groups, we may reduce to the case where \(G\) is cyclic, and then the result is clear. (For \(G = (\ZZ/N\ZZ)^\times\text{,}\) we can make this more explicit: we can use the Chinese remainder theorem to split \(N\) into distinct prime-power factors, then use the fact that \((\ZZ/p^n\ZZ)^\times\) is cyclic unless \(p=2\text{,}\) in which case it is \(\{\pm 1\}\) times a cyclic group.)
We saw this argument once before, but here it goes again: the left side is invariant under multiplication by \(\chi_1(h) \overline{\chi_2(h)}\) for any \(h \in G\text{,}\) because there is no difference between summing over \(g\) or over \(gh\text{.}\) If \(\chi_1 \neq \chi_2\text{,}\) then we can make that multiplier different from 1 by choosing suitable \(h\text{.}\) So the sum vanishes if \(\chi_1 \neq \chi_2\text{.}\) If \(\chi_1 = \chi_2\text{,}\) each summand is equal to 1 because characters of finite groups takes values which are roots of unity.
At last, we are ready to prove a result that will imply Theorem 5.2.
Theorem5.11.
For any positive integers \(m,N\) with \(\gcd(m,N) = 1\text{,}\) the set of primes congruent to \(m\) modulo \(N\) has Dirichlet density \(1/\phi(N)\) in the set of all primes (hence is infinite).
Section5.4The prime number theorem in arithmetic progressions
The proof of Dirichlet's theorem only uses information about the behavior of \(L(s,\chi)\) near \(s=1\text{.}\) Using the fact that \(L(s,\chi) \neq 0\) on the entire line \(\Real(s) = 1\text{,}\) we can prove a much stronger result.
Theorem5.12.Prime number theorem in arithmetic progressions.
For \(m,N\) positive integers with \(\gcd(m,N)\text{,}\) the set of primes congruent to \(m\) modulo \(N\) has natural density \(1/\phi(N)\text{.}\) In other words, the number of primes \(p \leq x\) with \(p \equiv m \pmod{N}\) is asymptotic to \(\frac{1}{\phi(N)} \frac{x}{\log x}\) as \(x \to \infty\text{.}\)
Given what we now know, this is a straightforward adaptation of our proof of the prime number theorem (Chapter 2). We thus skip over some details which are not much changed, in order to focus on the new ingredients.
For \(\chi\) a Dirichlet character of level \(N\text{,}\) define
\begin{equation*}
\vartheta_\chi(x) := \sum_{p \leq x} \chi(p) \log p.
\end{equation*}
Given a choice of \(m\) coprime to \(N\text{,}\) put
\begin{align*}
\vartheta_m(x) \amp = \frac{1}{\phi(N)} \sum_\chi \overline{\chi(m)} \vartheta_\chi(x)\\
\amp = \sum_{p \leq x: p \equiv m\,(N)} \log p
\end{align*}
where the last equality follows from Theorem 5.10. As in Lemma 2.7, the desired result is equivalent to \(\vartheta_m(x) \sim \frac{1}{\phi(N)} x\text{.}\)
As in Section 2.3, it will suffice to show convergence of the improper integral
in which the sum runs over the Dirichlet characters of level \(N\text{.}\) For \(\chi\) principal, \(L(s,\chi)\) differs from \(\zeta(s)\) by finitely many Euler factors, none of which has a zero or pole at \(s=1\text{;}\) hence \(-L'(s,\chi)/L(s,\chi)\) again has a simple pole at \(s=1\) with residue 1 and no other poles in \(\Real(s) \geq 1\text{.}\) Meanwhile, for \(\chi\) nonprincipal, \(L(s,\chi)\) is actually holomorphic for \(\Real(s) \geq 1\) (by Theorem 4.5) with no zeroes in this region (by Theorem 4.8, Theorem 4.10, and Theorem 4.11); hence \(-L'(s,\chi)/L(s,\chi)\) is holomorphic for \(\Real(s) \geq 1\text{.}\)
We deduce from the above that \(\Phi(s+1) - \frac{1}{s} \) extends holomorphically at \(s=0\text{.}\) Now applying Theorem 2.8 as in Section 2.4, we deduce that the value of this function at \(s=0\) computes \(\int_0^\infty (\vartheta_m(e^t)e^{-t} - 1)\,dt\) plus an integral which converges absolutely (corresponding to the summands with \(n>1\) in (5.4.2)). Subsituting \(x = e^t\) yields (5.4.1) as required.
Echoing Remark 2.9, the proof of Theorem 5.12 yields very little information about the error term, i.e., the difference between the actual number of primes \(p \leq x\) with \(p \equiv m \pmod{N}\) and the asympotic count \(\frac{1}{\phi(N)} \frac{x}{\log x}\text{.}\) That becomes a problem if, for instance, we want to know how far we have to go into an arithmetic progression to find a single prime. To address this, we must first get better results on zero-free regions for the functions \(L(s,\chi)\text{,}\) then make a more careful analytic argument to take advantage of this additional information. This will be the focus of the second part of the course.
Exercises5.5Exercises
1.
Prove directly (without using Dirichlet's theorem) that there are infinitely many primes congruent to 3 mod 4.
Form the matrix \(A\) with rows indexed by \(g \in G\text{,}\) columns indexed by \(\chi \in \hat{G}\text{,}\) and entries \(\chi(g)\text{.}\) Then compare \(A A^*\) with \(A^* A\text{,}\) where \(*\) denotes conjugate transpose. Or if you prefer, prove that the dual of \(\hat{G}\) is canonically isomorphic to \(G\text{.}\)