Causal versus random: the Tate conjecture and equidistribution

Chapter 18 Causal versus random: the Tate conjecture and equidistribution

Original lecture date: December 9, 2019.

In this last lecture, we talk about two different-sounding things that we'll see are actually related: the Tate conjecture, and distribution problems. Roughly speaking, we'll see that the Tate conjecture is a question about the “causal” factors of zeta functions, whereas distribution problems are questions about “random” factors of zeta functions. The idea is that zeta functions should be made up of these two parts, the first coming from the geometry of the variety, and the second being something that we can (sometimes) show is really random in a suitable sense.

Readings 18.0.1.

Since this lecture covers many disparate topics, suggestions for additional reading have been embedded in the text as appropriate.

Section 18.1 The Hodge conjecture

The Tate conjecture is an analogue of the Hodge conjecture, so we'll start with that.

Definition 18.1.1.

Let \(X/\CC\) be a smooth, proper variety of dimension \(n\text{,}\) and look at the singular cohomology \(H^{i}(X^{\analytic},\CC)\) of the associated complex analytic space. (From now on we'll sloppily write \(X\) for \(X^{\analytic}\text{.}\)) We know that \(H^{i}(X,\QQ)\) injects into \(H^{i}(X,\QQ)\otimes_\QQ \CC = H^i(X, \CC)\text{;}\) that is, the \(\CC\)-vector space contains a lattice which remembers which classes are rational.

We also have the Hodge decomposition

\begin{equation*} H^i(X,\CC)\cong \bigoplus_{p+q=i} H^{p,q}, \qquad H^{p,q}\colonequals H^q(X,\Omega^p). \end{equation*}

Let \(Z\hookrightarrow X\) be a closed irreducible subvariety of pure codimension \(p\text{.}\) Via the cycle class map (i.e., by viewing \(Z\) as representing a homology class and then dualizing), \(Z\) gives rise to a class in \(H^{2p}(X,\QQ)\cap H^{p,p}(X)\text{.}\)

Conjecture 18.1.2. Hodge conjecture.

The intersection \(H^{2p}(X,\QQ)\cap H^{p,p}(X)\) is spanned by classes coming from subvarieties.

Not much is known about the Hodge conjecture.

Theorem 18.1.3. Lefschetz (1,1)-theorem.

The Hodge conjecture holds for \(p=1\text{.}\)

Proof.

See [54], Page 163 for a proof, and a lot of background.

Example 18.1.4.

Let \(A/\CC\) be an abelian variety. By the Lefschetz (1,1)-theorem, the space of rational classes \(H^{2}(A,\QQ)\cap H^{1,1}(A)\) can be described using endomorphisms of \(A\text{.}\) In particular, if \(A\) has trivial endomorphism ring, then this space is 1-dimensional.

Remark 18.1.5.

Besides the \(p=1\) case, not much is known about the Hodge conjecture. It is far from clear “where to look” for a subvariety corresponding to a particular \((p,p)\)-class.

One case that can be handled is the case \(p=2\) when \(X\) is a K3 surface, which is to say a smooth projective surface such that \(K_X\simeq \calO_X\) and which is simply connected (say, in the sense that every geometrically connected finite étale cover of \(X\) splits).

For a K3 surface, the weight-2 Hodge structure can be embedded into the square of a weight-1 Hodge structure coming from a certain abelian variety, via the Kuga–Satake construction. Using a similar construction, Deligne was able to prove the Weil conjectures for K3 surfaces before coming up with the general proof [29].

Section 18.2 The Tate conjecture

Definition 18.2.1.

Let \(X\) be a smooth proper scheme over a finite field \(k\) with \(q\) elements. As in the complex case, for any codimension-\(i\) subvariety \(Z\) on \(X\text{,}\) the cycle class map gives us a class in \(H^{2i}(X)\text{.}\) By the Weil conjectures, this class is a Frobenius eigenvector with eigenvalue \(q^i\text{.}\)

The Tate conjecture is now the natural analogue of the Hodge conjecture.

Conjecture 18.2.2.

The generalized \(q^i\)-eigenspace of \(H^{2i}(X)\) is spanned by cycles coming from codimension-\(i\) subvarieties of \(X\text{.}\)

Remark 18.2.3.

Note that we don't a priori know that Frobenius gives a diagonalizable matrix, so part of the conjecture is a semisimplicity statement. We can rephrase this to get a concrete prediction about zeta functions. In this language, the Tate conjecture says that

\begin{equation*} Z(X,T)=\frac{\cdots}{\cdots\det(1-FT,H^{2i}(X))\cdots} \end{equation*}

has a pole at \(T=q^{-i}\) of order equal to the dimension of the space spanned by the codimension-\(i\) cycle classes. So given a variety, we can write down its zeta function explicitly, look at its poles, and get a prediction about cycle classes which would generally be hard to find by hand.
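As a sanity check of this prediction in a case where everything is known (an illustration that is not part of the original lecture), take \(X=\PP^1\times\PP^1\) over \(\FF_q\text{.}\) Counting points directly gives

\begin{equation*} \# X(\FF_{q^r})=(q^r+1)^2=1+2q^r+q^{2r}, \qquad Z(X,T)=\exp\left(\sum_{r\geq 1}\# X(\FF_{q^r})\frac{T^r}{r}\right)=\frac{1}{(1-T)(1-qT)^2(1-q^2T)}. \end{equation*}

The pole at \(T=q^{-1}\) has order 2, which matches the two independent divisor classes coming from the two rulings of \(X\text{.}\)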

Remark 18.2.4.

The Tate conjecture tends to be as hard as the Hodge conjecture. It's known for \(i=1\) for abelian varieties (this is equivalent to Tate's theorem) and for K3 surfaces by a recent result of Ito–Ito–Koshikawa [68].

The K3 case is already extremely hard and interesting, so let's look a little at it. For \(X\) a K3 surface,

\begin{equation*} Z(X,T)=\frac{1}{(1-T)P(T)(1-q^2T)}. \end{equation*}

Here the outer factors of the denominator come from \(H^0\) and \(H^4\text{,}\) and the middle factor \(P(T)\) comes from \(H^2\text{,}\) which is the interesting bit. Renormalizing to make the roots lie on the unit circle, we have \(Q(T):=qP(q^{-1}T)=a_0T^{22}+a_1 T^{21}+\cdots +a_{22}\) with \(a_{22} = q\) and \(a_0 = \pm q\text{,}\) and we are interested in the multiplicity of the factor \((1-T)\) in \(Q\text{.}\)

This renormalization introduces lots of powers of \(q\) in the denominators, so one might expect \(Q\) to no longer be integral. However, there is a crucial piece of information that we have not yet introduced in these lectures.

Proposition 18.2.5.

The coefficients \(a_i\) of \(Q(T)\) are all integers.

Sketch.

This comes from Mazur's “Newton above Hodge” theorem [94]. In this case, the theorem says that the Newton polygon of \(\det(1-FT,H^{2}(X))\) lies above the Hodge polygon, which has integer slopes given by the second row of the Hodge diamond. This row is 1,20,1, so the Hodge polygon has slope 0 for one step, slope 1 for 20 steps, and slope 2 for the final step. Renormalizing and scaling to write this in terms of \(Q\text{,}\) the polygon starts at \((0,1)\text{,}\) goes down to \((1,0)\text{,}\) goes horizontally to \((21,0)\text{,}\) then goes up to \((22,1)\text{.}\) In particular, it never dips below the \(x\)-axis, so the coefficients of \(Q\) must all have nonnegative \(q\)-adic valuation and therefore will be integers.

Remark 18.2.6.

Continuing with our K3 example, we also have a symmetry property \(a_{22-j}=\pm a_j\) (where this sign is uniform over \(j\)). So we either have actual symmetry, or there's a sign flip after we cross the middle. It is possible but tricky to understand which of these actually happens using the geometry of the K3 surface.

As we are looking at \(H^2\text{,}\) we are in the case where \(i=1\text{,}\) so our cycles are divisors. The well-studied Néron–Severi group \(\NS(X)\) is the group of divisors modulo algebraic equivalence. Rephrasing the conjecture one more time, we are saying that the order of vanishing of \(Q(T)\) at \(T=1\) is equal to the Picard number of \(X\text{,}\) also known as the rank of \(\NS(X)\text{.}\) Call this order \(r\text{;}\) it must be an integer between 1 and 22, the 1 because there is automatically a Tate class corresponding to an ample divisor, and the 22 because this is the dimension of \(H^2\) (see Remark 18.2.7). With this \(r\text{,}\) we have the Artin–Tate formula (a conjecture in general, but known for K3 surfaces)

\begin{equation*} \frac{Q(T)}{(1-T)^r}|_{T=1}=D\# \Br(X) \end{equation*}

where \(D\) is the discriminant of the Néron–Severi lattice and \(\Br(X)\) is the Brauer group: a finite group with order a perfect square.

This should remind us of the conjecture of Birch–Swinnerton-Dyer. It actually coincides with it in certain cases: when \(X\) is an elliptic K3 surface, the Brauer group coincides with the Tate–Shafarevich group. In general, just like the latter, the Brauer group carries an alternating pairing which forces its order to be a square.

Remark 18.2.7.

In characteristic 0, the Picard number can only go up to 20, but in characteristic \(p\) the value 22 is actually possible! This is similar to the fact that the endomorphism ring of an elliptic curve in characteristic 0 has rank at most 2, but in characteristic \(p\) it can have rank as high as 4.

Remark 18.2.8.

A nice thing here is that if we're handed a K3 surface over a small finite field, we can compute this polynomial, get our hands on \(r\text{,}\) and make a guess about what the constants should be. For example, if \(r=1\) and \(X\) is a smooth quartic in \(\PP^3\text{,}\) then \(D=4\text{,}\) so the right-hand side of the Artin–Tate formula is 4 times a perfect square; the value computed from \(Q\) on the left-hand side should therefore be a perfect square as well. In [79], Kedlaya–Sutherland checked that this is always true over \(\FF_2\text{.}\)

Remark 18.2.9.

This isn't quite the whole story; so far we've been focused on cycles that are defined over \(\FF_q\text{.}\) If we pass to a finite extension, we might get more cycles, and the Tate conjecture would tell us that they should show up in the zeta function as well. For example, if \(1+T\) divides \(Q(T)\text{,}\) then \(-q^{-1}\) is a root of \(P\text{,}\) and if we base change to \(\FF_{q^2}\text{,}\) the Frobenius eigenvalues get squared and we obtain the root \(q^{-2}\text{,}\) which is the Tate value over the new base field. So the Tate conjecture is also saying that roots of the form \(\zeta q^{-1}\) for any root of unity \(\zeta\) are “causal”: once we base-change, they should also come from cycles. We should therefore be thinking about the factorization of \(Q(T)\) into cyclotomic factors and noncyclotomic factors. The cyclotomic factors tell us about the geometric Néron–Severi rank.

There's a nice heuristic about point counting underlying all of this. Going back to the general Tate conjecture, any codimension-\(i\) subvariety \(Z\) of \(X\) will make some “geometric” contribution to the number of \(\FF_{q^r}\)-points on \(X\text{,}\) coming from the \(\FF_{q^r}\)-points on \(Z\text{.}\) The (unnormalized) factor of \((1-q^iT)\) in the denominator is just keeping track of this contribution. So if \(X\) has lots of codimension-\(i\) subvarieties, we expect it to have more rational points than usual, giving rise to a pole of higher order in the zeta function. This heuristic seems very similar to the heuristic that led to the Birch–Swinnerton-Dyer conjecture: if an elliptic curve over a number field has high rank, then its reductions modulo primes are forced to have lots of points, which should again lead to a high order of vanishing of the \(L\)-function at \(s=1\text{.}\)

Remark 18.2.10.

For a final application in this section, we explain the key idea behind constructing a K3 surface over \(\QQ\) with geometric Picard number 1. This construction is due to van Luijk [125] and answers a question of Mumford.

If one starts with a K3 surface over \(\QQ\text{,}\) its geometric Picard number can only increase under specialization, as the Néron–Severi lattice of a characteristic 0 K3 surface injects into the Néron–Severi lattice of a reduction. So in principle, we could try to prove that the geometric Picard number is 1 by reduction to a finite field. But there's a catch: the polynomial \(Q\) has integer coefficients and its degree is even (22), so the geometric Picard number of any reduction is always even (the noncyclotomic part necessarily has even degree).

This seems to be the end of the story, until we realize (as van Luijk did) that we can apply the Artin–Tate formula at various different primes of good reduction and compare the answers. To wit, van Luijk constructs a family of K3 surfaces over \(\ZZ\) whose reductions at 2 and 3 both have geometric Picard number 2 (the smallest possible value given the previous discussion). Using the Artin–Tate formula, he shows that the discriminants of the lattices in characteristics 2 and 3 are \(-12\) and \(-9\text{,}\) which represent different elements of \(\QQ^*/\QQ^{*2}\text{.}\) But if the geometric Néron–Severi lattice in characteristic 0 had rank 2, its discriminant would agree modulo squares with each of these, as it would embed in each reduction as a sublattice of full rank. As this cannot happen, the geometric Picard number must be 1.
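The last step of this argument is a comparison of square classes, which is easy to check by machine. The following is a minimal sketch (the helper names `squarefree_part` and `same_square_class` are made up for this illustration): two nonzero integers represent the same element of \(\QQ^*/\QQ^{*2}\) exactly when their signed squarefree parts agree.

```python
def squarefree_part(n):
    """Return the squarefree integer s (with the sign of n) such that n = s * m^2."""
    if n == 0:
        raise ValueError("n must be nonzero")
    sign = -1 if n < 0 else 1
    n = abs(n)
    s = 1
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        if exponent % 2 == 1:
            s *= d  # d divides n to an odd power, so it survives modulo squares
        d += 1
    # Whatever is left is 1 or a single prime factor.
    return sign * s * n


def same_square_class(a, b):
    """Decide whether a and b agree in Q^*/Q^{*2} (both assumed nonzero integers)."""
    return squarefree_part(a) == squarefree_part(b)


if __name__ == "__main__":
    # The two discriminants quoted in the text.
    print(squarefree_part(-12), squarefree_part(-9), same_square_class(-12, -9))
```

A rank-2 lattice embedding with finite index in both reductions would force the two square classes to coincide, so a `False` answer here is exactly the contradiction used in the text.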

For more results about the variation of Picard numbers under specialization, see [26], [21], [24].

Section 18.3 Distribution questions

Individual zeta functions can be unpredictable, but we can make headway looking at distributions of lots of them. Here are three different flavors of questions that people study.

1. Fix a finite field \(\FF_q\) and look at a class of varieties \(\{X\}\) over \(\FF_q\text{.}\) The zeta function of each variety is related to the number of points, so we can consider \(\# X(\FF_q)\) (or something related) as a random variable on the probability space of all such \(X\text{.}\)
2. Look at a geometric family of varieties, i.e., look at the fibers of a map \(X\rightarrow S\) over closed points of \(S\) where both \(X\) and \(S\) are varieties over \(\FF_q\text{.}\) Now that we're looking over all closed points, we'll also be counting \(\FF_{q^r}\)-points.
3. Look at the same question for an arithmetic family \(X\rightarrow \Spec(\calO_K)\) for a number field \(K\text{.}\)

For an example of the first flavor, look at smooth plane curves over \(\FF_q\text{.}\) We want to understand \(\# X(\FF_q)\) viewed as a random variable on the probability space of all such \(X\text{.}\) For a fixed degree \(d\text{,}\) this is a finite probability space as there are finitely many curves, so we should really average over all \(d\) in some fashion. We take all \(d\) up to some bound, compute the distribution, then take the limit as the bound goes to infinity.

Theorem 18.3.1.

The resulting distribution is that of a sum of \(q^2+q+1\) independent, identically distributed \(\{0,1\}\)-valued random variables, each with mean \((q+1)/(q^2+q+1)\text{,}\) so that the total mean is \(q+1\text{.}\)

Sketch.

This is an application of Poonen's Bertini theorem [106] by Bucur–David–Feigon–Lalín [17]. Here's where this is coming from. The quantity \(q^2+q+1\) is the number of points of \(\PP^2(\FF_q)\text{.}\) We can think of each point as a variable, and ask if that point is a rational point of a given plane curve \(X\text{.}\) At the point 0, we can locally expand out our curve as being cut out by the equation \(a+bx+cy+\dots\text{.}\) So 0 is on the curve exactly when \(a=0\text{,}\) and it's a smooth point of the curve if in addition \((b,c)\neq (0,0)\text{.}\) If we exclude the case where \(a=b=c=0\) (by sieving), all other possibilities are equally likely. Now we have \(q^3-1\) total possibilities, of which \(q^2-1\) are good. So the probability that a given point is on a random curve is \((q^2-1)/(q^3-1)=(q+1)/(q^2+q+1)\text{,}\) and one uses Poonen's theorem to ensure that each point contributes independently to the count.
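The local count in this sketch can be verified by brute force. The snippet below is only an illustration (the function name `local_probability` is hypothetical): it enumerates the triples \((a,b,c)\) of leading Taylor coefficients, discards \(a=b=c=0\text{,}\) and reads the "good" condition as \(a=0\) with \((b,c)\neq(0,0)\text{,}\) which is the reading that gives the count \(q^2-1\text{.}\) Only the size \(q\) of the field enters, so residues modulo \(q\) are used as labels.

```python
from fractions import Fraction
from itertools import product


def local_probability(q):
    """Fraction of nonzero triples (a, b, c) with a = 0, i.e. the chance that
    a fixed point lies on a random curve that is smooth at that point."""
    total = 0
    good = 0
    for a, b, c in product(range(q), repeat=3):
        if (a, b, c) == (0, 0, 0):
            continue  # excluded by the sieve: the curve would be singular here
        total += 1
        if a == 0 and (b, c) != (0, 0):
            good += 1
    return Fraction(good, total)


if __name__ == "__main__":
    for q in (2, 3, 5, 7):
        prob = local_probability(q)
        points = q * q + q + 1  # number of points of P^2(F_q)
        print(q, prob, points * prob)  # last column is the expected point count
```

Multiplying the local probability by the number of points of \(\PP^2(\FF_q)\) recovers the total mean \(q+1\) from Theorem 18.3.1.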

Remark 18.3.2.

Poonen's Bertini theorem asserts that given a smooth quasiprojective variety \(X\text{,}\) the probability that an ample hypersurface section of \(X\) is smooth is given by a product of local probabilities, each computing the probability that there is no failure of smoothness at a given point. It has spawned a sizable literature concerning questions of a similar flavor. Two notable examples are the paper of Bucur–Kedlaya [18], which extends Poonen's theorem by considering a complete intersection of multiple hypersurfaces (this came up previously in Remark 6.0.17), and that of Erman–Wood [43], which allows the use of semiample hypersurfaces (at the cost of some degree of independence between points).

Remark 18.3.3.

The standard reference for questions of type 2 is the book of Katz–Sarnak [70]. We won't talk much about these today, except to say that they can generally be settled by combining Weil II with a computation of a certain monodromy group attached to the family of varieties. See Theorem 18.3.11 below for an example.

It's generally harder to prove anything for questions of type 3 than type 2, but we expect similar answers. In each case, assuming \(X\) is smooth and proper, we have \(Z(X,T)=\prod L_i(T)^{(-1)^{i+1}}\) where \(L_i(T)\) is pure of weight \(i\text{.}\) We normalize to get \(\overline{L}_i(T)=L_i(q^{-i/2}T)\text{,}\) whose reciprocal roots lie on the unit circle.

Metaconjecture 18.3.4.

We expect the \(\overline{L}_i(T)\) to behave like characteristic polynomials of random matrices in a certain compact Lie group \(G\text{.}\) Here randomness is measured with respect to the (unique) Haar measure on \(G\text{.}\)

Remark 18.3.5.

This metaconjecture predates people thinking about finite fields. It was originally introduced to think about the Riemann zeta function, based on the idea (attributed to Pólya) that the zeroes of \(\zeta\) should be (up to rotation) the eigenvalues of some self-adjoint operator on some Hilbert space. Since we have no idea what this operator should look like, we might hope that it behaves like a “random” operator, and indeed evidence (from Montgomery, Odlyzko, and others) suggests that the distribution of zeroes of \(\zeta\) does have some features in common with the eigenvalues of suitable random matrices.

We now turn to the best-known question of type 3: the Sato–Tate conjecture for elliptic curves over \(\QQ\text{.}\)

Definition 18.3.6.

To formulate the Sato–Tate conjecture, let \(E/\QQ\) be an elliptic curve. For \(p\) a prime of good reduction, let \(a_p\) be the trace of Frobenius on \(E_{\FF_p}\text{;}\) the Hasse bound says that \(a_p \in [-2 \sqrt p,2 \sqrt p]\text{,}\) so we divide by \(\sqrt p\) to renormalize \(a_p\text{.}\) This lets us compare the values of \(a_p\) as \(p\) varies over all primes of good reduction.

Theorem 18.3.7.

Given an elliptic curve \(E/\QQ\text{,}\) as \(p\) varies over all primes of good reduction, the \(a_p/\sqrt p\) are equidistributed (see below) in \([-2,2]\) with respect to one of the following measures:

If \(E\) has CM, the trace of a random matrix in \(N(\SO(2),\SU(2))\) (the normalizer);
else, the trace of a random matrix in \(\SU(2)\text{.}\)

Remark 18.3.8.

For a gif of this theorem in action, see this animation by Drew Sutherland ¹. See also Sutherland's website ² for additional examples.

Definition 18.3.9.

The notion of equidistribution comes from ergodic theory. Let \(X\) be a compact topological space equipped with a probability measure \(\mu\text{,}\) and let \(x_1,x_2,\dots\) be a sequence in \(X\text{.}\) We say this sequence is equidistributed if for all continuous functions \(f:X\rightarrow \RR\text{,}\) we have

\begin{equation*} \int_X f\,d\mu=\lim_{N\rightarrow \infty} \frac{f(x_1)+\cdots+f(x_N)}{N}. \end{equation*}

The intuition here is that the left-hand side of this equation represents a “space average” of the function \(f\text{,}\) while the right-hand side represents a “time average” over a sequence of sample points.

Theorem 18.3.7 is now a (very, very hard) theorem of Clozel–Harris–Taylor [22], Harris–Shepherd-Barron–Taylor [62], and many more, proved using automorphic forms. The relevant \(X\) is the space of conjugacy classes of a compact Lie group \(G\text{,}\) which carries a natural measure coming from the Haar measure on \(G\text{.}\) When we look at a distribution problem on a space of conjugacy classes, it suffices by the Peter–Weyl theorem to test the equidistribution property for \(f=\chi\) an irreducible character. For these, an argument of Serre (inspired by the prime number theorem; see [113], Chapter I, Appendix) shows that the limiting property follows from analytic continuation of suitable \(L\)-functions, and this is where automorphic forms enter.
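The "time averages" in Definition 18.3.9 are easy to experiment with for Theorem 18.3.7. The sketch below is an illustration only: the curve \(y^2=x^3+x+1\) is an arbitrary choice (its \(j\)-invariant \(6912/31\) is not an integer, so it has no CM), the function names are made up, and the point count is deliberately naive. It computes \(a_p/\sqrt p\) for small primes of good reduction and prints sample moments, to be compared with the moments \(0, 1, 0, 2\) of the trace of a Haar-random matrix in \(\SU(2)\text{.}\)

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes (assumes n >= 2)."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def trace_of_frobenius(a, b, p):
    """a_p = p + 1 - #E(F_p) for E: y^2 = x^3 + a x + b, with p an odd prime.

    Uses #E(F_p) = p + 1 + sum over x of the Legendre symbol of x^3 + a x + b.
    """
    total = 0
    for x in range(p):
        v = (x * x * x + a * x + b) % p
        if v == 0:
            continue
        # Euler's criterion: v is a square mod p iff v^((p-1)/2) = 1 mod p.
        total += 1 if pow(v, (p - 1) // 2, p) == 1 else -1
    return -total


def normalized_traces(a, b, bound):
    """List of a_p / sqrt(p) over odd primes p <= bound of good reduction."""
    disc = 4 * a**3 + 27 * b**2
    return [
        trace_of_frobenius(a, b, p) / math.sqrt(p)
        for p in primes_up_to(bound)
        if p != 2 and disc % p != 0
    ]


def sample_moment(values, k):
    """The k-th sample moment, i.e. the time average of x -> x^k."""
    return sum(v**k for v in values) / len(values)


if __name__ == "__main__":
    xs = normalized_traces(1, 1, 2000)
    for k in (1, 2, 3, 4):
        print(k, round(sample_moment(xs, k), 3))
```

Convergence is slow, so with a bound this small one should only expect rough agreement with the limiting moments.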

We could ask the same question over other number fields \(K\text{.}\) There's a similar conjecture, except if \(K\) contains an imaginary quadratic field \(M\text{,}\) there's a third option when \(E\) has CM in \(M\text{.}\) Then the random matrices are in \(\SO(2)\text{.}\) The CM cases can be settled using Hecke's theory of Grössencharacters, so the real issue is the non-CM cases.

Theorem 18.3.10.

The analogue of the Sato–Tate conjecture is known when \(K\) is either a totally real number field, or a CM field (a totally imaginary quadratic extension of a totally real field, such as an imaginary quadratic field).

Proof.

The first case was settled by Barnet-Lamb–Gee–Geraghty [8]. The second case was settled by the “paper of 10 authors” [3].

For comparison, let us also formulate the corresponding type 2 question; this amounts to looking at all elliptic curves over finite extensions of some \(\FF_q\text{.}\)

Theorem 18.3.11.

Let \(f: X \to S\) be a family of elliptic curves over a finite-type \(\FF_q\)-scheme, and assume that the \(j\)-invariant is nonconstant in the family (that is, the family is nonisotrivial). Then as \(s\) varies over closed points of \(S\text{,}\) if we write \(a_s\) for the trace of Frobenius on \(f^{-1}(s)\text{,}\) the quantities \(a_s/\sqrt{\#\kappa(s)}\) are equidistributed with respect to the distribution of traces of random matrices in \(\SU(2)\text{.}\)

Proof.

This is proved by Deligne in his Weil II paper [31], Section 3.5.

Remark 18.3.12.

In more general cases, you figure out what Lie group to use (either conjecturally or provably) by looking at the image of monodromy in the case of geometric families, or the image of Galois in the case of arithmetic families.

In the number field case, you look at the associated Galois representation; look at how big the image might possibly be; check that you aren't missing constraints that could make the image smaller; and then hope that this actually is the image. For example, for CM elliptic curves, the image of Galois must respect the endomorphisms and is thus much smaller than for non-CM elliptic curves; this is then reflected in the distribution of the Frobenius traces.

By looking carefully at the constraints involved, one can sometimes identify all possible candidates for the distribution and for the compact Lie group (the Sato–Tate group) giving rise to it. For example, if \(X/K\) is a genus 2 curve over a number field, or an abelian surface, there are 52 possible Sato–Tate groups [46], and one can easily distinguish the resulting distributions numerically. For an abelian threefold, there are 410 possible groups [47]. For further discussion of the Sato–Tate conjecture and its generalizations, see [119].

Section 18.4 Tying everything together

We'll end with a remarkable application of random matrix theory to questions about point counting, which ties together our discussions of causality and randomness in zeta functions.

Remark 18.4.1.

We start with a striking fact from probability theory due to Diaconis–Shahshahani [34]. Consider the \(k\)-th moment of the trace of a random matrix in the unitary group \(\mathrm{U}(n)\text{.}\) If we fix \(k\) and send \(n\) to \(\infty\text{,}\) we might expect the moment to grow; after all, we're taking the trace of a really big matrix! But this doesn't happen: the \(k\)-th moment stabilizes for \(n\geq k\text{.}\) In particular, the trace of a random matrix typically stays bounded in size as \(n\) grows! Similar results hold for orthogonal and symplectic groups.
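This stabilization can be observed in a small simulation. The following is a rough sketch under stated assumptions, not a reference implementation: it samples Haar-random unitary matrices by applying Gram–Schmidt to a matrix of independent complex Gaussians (a standard way to obtain Haar measure), and estimates the mean of \(|\operatorname{tr} U|^2\text{,}\) which equals 1 for every \(n\geq 1\text{.}\) The function names, sample sizes, and seed are arbitrary choices.

```python
import math
import random


def haar_unitary(n, rng):
    """Sample a Haar-random element of U(n), returned as a list of columns.

    Gram-Schmidt is applied to n independent complex Gaussian vectors.
    """
    columns = []
    for _ in range(n):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        for u in columns:
            # Remove the component of v along the unit vector u.
            coeff = sum(ui.conjugate() * vi for ui, vi in zip(u, v))
            v = [vi - coeff * ui for ui, vi in zip(u, v)]
        norm = math.sqrt(sum(abs(vi) ** 2 for vi in v))
        columns.append([vi / norm for vi in v])
    return columns


def trace(columns):
    """Trace of the matrix whose j-th column is columns[j]."""
    return sum(columns[j][j] for j in range(len(columns)))


def second_moment(n, samples, seed=0):
    """Monte Carlo estimate of the mean of |tr U|^2 over U(n)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += abs(trace(haar_unitary(n, rng))) ** 2
    return total / samples


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(second_moment(n, 400), 3))
```

The estimates should hover around 1 regardless of \(n\text{,}\) up to sampling error, even though the matrices themselves grow.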

What does this mean for zeta functions? Let \(X\) be a smooth proper scheme over \(\FF_q\text{,}\) and factor the \(i\)-th piece of the zeta function \(L_i(T)=\det(1-FT,H^i(X))\) into a causal part (the Tate classes if \(i\) is even, otherwise nothing) and a random part \(P_r(T)\) (everything else). Then if the renormalized polynomial \(P_r(q^{-i/2} T)\) really corresponds to a random matrix (say in the unitary group \(\mathrm{U}(n)\) or the unitary symplectic group \(\mathrm{USp}(n)\)), the trace should be fairly small even if the degree is large. It's therefore reasonable to expect that as the degree gets big, the zeta function will be dominated by the causal factors.

One application of this logic is a heuristic prediction about the distribution of the number of rational points \(\# X(\FF_q)\) as \(X\) varies over all curves of a given genus [2]. This prediction involves point counts on \(M_{g,n}\text{,}\) whose cohomology has a causal part (the stable cohomology) and a non-causal part (the unstable cohomology). If we predict that the unstable part should act randomly, then the causal part will dominate.

¹ math.mit.edu/~drew/g1_r28_a1f.gif

² math.mit.edu/~drew/g1SatoTateDistributions.html