In this lecture, we’ll talk about the proof of the (Riemann hypothesis part of the) Weil conjectures and Deligne’s Weil II theorem. As this is far too deep a topic to cover thoroughly in one lecture, we instead describe a few of the main tools used in the proof and sketch their application, emphasizing analogies and intuition.
The major advance of Weil II over Weil I is to allow for cohomology with nonconstant coefficients. Roughly speaking, this will allow us to translate the Riemann hypothesis (a statement about the cohomology of a simple sheaf on a complicated space) into a statement about a complicated sheaf on a simple space. In particular, we will be able to induct on dimension and reduce consideration to one-dimensional spaces.
The original source is Deligne’s “Weil II” paper [31], but the use of the Fourier transform was introduced later by Laumon [84]. We follow most closely [73], which is written in terms of \(p\)-adic coefficients but can be translated fairly directly back to the \(\ell\)-adic side. See also [69] for an approach to “Weil II” in the style of “Weil I” [30].
Making this a little more precise, let’s start with a smooth connected variety \(X\) of dimension \(n\) over a finite field \(k\text{.}\) Then we’ll study a tower
\begin{equation*}
X=X_n\rightarrow X_{n-1}\rightarrow\cdots\rightarrow X_1\rightarrow k
\end{equation*}
where each map \(f_i:X_i\rightarrow X_{i-1}\) presents \(X_i\) as a family of curves over a space of dimension one lower. We want to understand the cohomology of \(X\) by summing over fibers of \(f=f_n:X\rightarrow X_{n-1}\) (the zeta function of \(X\) is literally a product over fibers of this map). The Leray spectral sequence gives a description of \(H^{i+j}(X,\overline{\QQ}_\ell)\) in terms of \(H^i(X_{n-1},R^j f_{*}\overline{\QQ}_\ell)\text{.}\) This in turn can be described in terms of cohomology of some sheaves on \(X_{n-2}\text{,}\) and so on; this dévissage allows us to reduce the general problem to understanding families of curves. Note that the Weil conjectures for curves tell us what we need to know about the sheaves \(R^j f_{*}\overline{\QQ}_\ell\text{,}\) but for the remaining steps we really need Weil II because we start already with nontrivial coefficients.
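For reference, the two facts used in this paragraph can be written out as follows. This is a sketch in the notation of the text, glossing over the distinction between ordinary and compactly supported cohomology; the symbols \(y\) and \(\deg(y)\) are ours. Here \(y\) runs over the closed points of \(X_{n-1}\text{,}\) \(\deg(y)=[\kappa(y):k]\text{,}\) and the fiber \(f^{-1}(y)\) is viewed as a variety over \(\kappa(y)\text{.}\)
\begin{equation*}
E_2^{i,j}=H^i(X_{n-1},R^j f_{*}\overline{\QQ}_\ell)\Longrightarrow H^{i+j}(X,\overline{\QQ}_\ell),
\qquad
Z(X,T)=\prod_{y} Z\left(f^{-1}(y),T^{\deg(y)}\right)
\end{equation*}
The product formula holds because every closed point of \(X\) lies in exactly one fiber, and a closed point of degree \(d\) over \(\kappa(y)\) has degree \(d\deg(y)\) over \(k\text{.}\)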
As reported earlier (Definition 16.1.8), the higher direct images \(R^i f_* \calF\) of a lisse \(\overline{\QQ}_\ell\)-sheaf do not always exist in the category of lisse \(\overline{\QQ}_\ell\)-sheaves, but only in some larger category of constructible \(\overline{\QQ}_\ell\)-sheaves. However, any object of the constructible category defines a stratification on its space, and it “looks lisse” on each stratum (but the rank may vary between strata). In particular, there is always a dense open subset on which it restricts to something lisse. More precisely, given a morphism \(f\colon X \to S\) and a lisse \(\overline{\QQ}_\ell\)-sheaf \(\calF\) on \(X\text{,}\) there is always an open dense subset \(U\) of \(S\) on which the higher direct images \(R^i f_* \calF\) are lisse and their formation commutes with arbitrary base change, in particular to a point. That means that these objects really are computing “cohomology in fibers”.
Of course, in the previous discussion we cannot simply throw away the part of \(S\) where the higher direct images of \(\calF\) are not lisse. We need a formalism of nearby cycles or vanishing cycles to extract information about the fibers over \(S-U\) from the fibers over \(U\text{.}\)
Here’s the classical picture to keep in mind. Imagine we have a family of complex varieties over a base with a singularity over some degenerate point. Then we should be able to understand the cohomology of the singular fiber by removing it and looking at the cohomology of all the “nice” things surrounding it. Imagine looping around this bad fiber and looking at a homology class of the good fibers surrounding it. As you go around this bad fiber, you’ll get a different class once you get to the end of a loop: this gives a monodromy action on the cohomology. (This picture is sometimes called a Milnor fiber.)
The intuition is that the cohomology of an “exceptional fiber” can be recovered from its neighbors by looking at monodromy invariants. So even if we don’t understand \(S-U\text{,}\) we can try to understand \(U\) near a point of \(S\text{,}\) and see what happens. The idea of vanishing cycles is that there are cycles that make sense in \(U\text{,}\) but as we move towards the exceptional point they get smaller and smaller until they vanish.
Say \(X\) is a curve, and \(\calE\) is a lisse sheaf on \(X\setminus\{s\}\) for some point \(s\text{,}\) so that \(\calE\) corresponds to a representation of \(\pi_1(X\setminus \{s\})\text{.}\) There is a surjection \(\pi_1(X\setminus \{s\})\rightarrow \pi_1(X)\text{,}\) but this won’t be injective as we have covers coming from looping around \(s\text{.}\) Picking a geometric base point \(\overline{x}\text{,}\) we have an exact sequence
\begin{equation*}
1\rightarrow I_s\rightarrow \pi_1(X\setminus\{s\},\overline{x})\rightarrow \pi_1(X,\overline{x})\rightarrow 1
\end{equation*}
where \(I_s\) is defined by this sequence and called the inertia group at \(s\). The number theory analogue is the inertia group in Galois theory. Roughly speaking, \(I_s\) is keeping track of the new cohomology coming from loops around \(s\text{.}\) (As an aside, this is closely related to the discussion of missing Euler factors in the \(L\)-function associated to a variety over a number field, from Remark 9.2.7.)
The \(p\)-adic analogue of inertia is much closer to differential geometry: it’s related to actual monodromy of differential equations. Because a \(p\)-adic coefficient object is a vector bundle with connection, you can actually try to solve differential equations using power series (possibly after adjoining some extra ring elements, as in differential Galois theory) and study monodromy that way.
If you’re trying to make some kind of estimate about a zeta function, you need to control the dimensions of the spaces \(H^i(X,\calF)\text{.}\) It’s therefore good to know things about Euler characteristics, as these are easier to control but still retain some of the needed information.
Let \(X\) be a curve over \(k\) with smooth compactification \(\overline{X}\text{.}\) Let \(\calF\) be a lisse sheaf (or a \(p\)-adic coefficient) on \(X\text{.}\) We define the Euler characteristic to be
\begin{equation*}
\chi(\calF)=\sum_i (-1)^i\dim H^i(X,\calF)
\end{equation*}
We’d expect this to be related to the Euler characteristic of \(X\text{,}\) which is \(\chi(X)=\chi(\overline{\QQ}_\ell)\text{,}\) and should also be related to \(\calF\) somehow. A natural guess is
\begin{equation*}
\chi(\calF)=\chi(X)\rank(\calF)
\end{equation*}
Let \(f\colon Y\rightarrow X\) be a finite étale morphism of curves and put \(\calF\colonequals f_*\overline{\QQ}_\ell\text{.}\) Then \(\chi(\calF)=\chi(Y)\text{,}\) which doesn’t agree with our guess! There’s a correction factor given by Riemann–Hurwitz, coming from the ramification at points of \(\overline{X}\setminus X\text{.}\) These factors can be computed locally, so we can look at them one at a time. In the étale case, this is essentially the Artin conductor of the inertia representation, which detects only what is happening at a single bad point.
This suggests that the shape of the correct formula is
\begin{equation*}
\chi(\calF)=\chi(X)\rank(\calF)+\sum_{x\in\overline{X}-X} (\text{correction at } x)
\end{equation*}
where the correction term depends only on what is happening at \(x\) (say, on the formal completion at \(x\)). This is true, but we will not give a more precise formulation here.
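As a sanity check on the shape of this formula, here is one worked instance; the particular cover is our choice of illustration and does not come from the discussion above. Take \(X=Y=\AAA^1\) in characteristic \(p\) and let \(f\colon Y\rightarrow X\) be the Artin–Schreier map \(y\mapsto y^p-y\text{,}\) which is finite étale of degree \(p\) because its derivative is the constant \(-1\text{.}\) Put \(\calF=f_*\overline{\QQ}_\ell\text{,}\) so that \(\rank(\calF)=p\text{.}\) Since \(\chi(\AAA^1)=1\text{,}\) we get
\begin{equation*}
\chi(\calF)=\chi(Y)=1,
\qquad
\chi(X)\rank(\calF)=p,
\qquad
\text{so the correction at } \infty \text{ must be } 1-p
\end{equation*}
The naive guess already fails with a single missing point, and the size of the discrepancy reflects the wild ramification of \(f\) at infinity.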
Let us recall how weights were defined in the previous lecture. Let \(X\) be a curve over \(k=\FF_q\) and fix an embedding \(\iota\colon \overline{\QQ}_\ell\hookrightarrow \CC\) or \(\iota\colon \overline{\QQ}_p\hookrightarrow \CC\text{.}\) Given a lisse \(\overline{\QQ}_\ell\)-sheaf \(\calF\) on \(X\text{,}\) we say that \(\calF\) is \(\iota\)-pure of weight \(w\) if for all closed points \(x\in X\text{,}\) the eigenvalues of \(\Frob\) on \(\calF_x\) all have \(\iota\)-absolute value \(\#\kappa(x)^{w/2}\text{.}\) We say that \(\calF\) is \(\iota\)-mixed of weight \(\leq w\) or \(\geq w\) if a corresponding condition holds.
It is not clear that one expects to be able to impose much effective control on these weights: we are manipulating \(\ell\)-adic objects and attempting to keep track of archimedean information, and the two are not very compatible. However, Deligne proves a key theorem that imposes some control on the situation.
Suppose \(\calF\) is \(\iota\)-real (that is, all of the Frobenius characteristic polynomials have coefficients in \(\iota^{-1}(\RR)\)). Then the irreducible subquotients (using Jordan–Hölder filtrations because we’re in an abelian category) of \(\calF\) are each \(\iota\)-pure of some weight.
Somehow, if you are able to force the real numbers into the picture, really nice things happen. The idea comes from an argument of Rankin about modular forms, a real analysis argument which boils down to the fact that squares of real numbers are nonnegative. This is useful for us because the next lemma says that any pure coefficient can be written as a subquotient of something real. So studying real coefficients isn’t as arbitrary as it may seem.
If \(\calF\) is \(\iota\)-pure of weight 0, then \(\calF^\vee\oplus \calF\) is \(\iota\)-real. More generally, if \(\calF\) is \(\iota\)-pure of some other weight, then some twist of \(\calF\oplus \calF^\vee\) is \(\iota\)-real.
In the weight 0 case, all of the Frobenius eigenvalues have complex norm 1, so they lie on the unit circle. The coefficients of the characteristic polynomials are all symmetric functions in the Frobenius eigenvalues, so they will be real if the set of eigenvalues is stable under complex conjugation. Because we’re living on the unit circle, this is the same as the eigenvalues being stable under inverses. The eigenvalues of \(\calF^\vee\) are precisely the inverses of the eigenvalues of \(\calF\text{,}\) and the eigenvalues of \(\calF^\vee\oplus \calF\) are the disjoint union of the eigenvalues of \(\calF^\vee\) and \(\calF\text{,}\) so we are all set.
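This argument can be written as a one-line computation. The notation is ours: \(\alpha_1,\dots,\alpha_r\) denote the eigenvalues of \(\Frob\) on \(\calF_x\) at a closed point \(x\text{,}\) and we assume the characteristic polynomial is normalized as \(\det(1-T\,\Frob)\text{.}\) Since \(|\iota(\alpha_i)|=1\) gives \(\iota(\alpha_i^{-1})=\overline{\iota(\alpha_i)}\text{,}\) the image under \(\iota\) of the characteristic polynomial of \(\Frob\) on \((\calF\oplus\calF^\vee)_x\) is
\begin{equation*}
\prod_{i=1}^r \left(1-\iota(\alpha_i)T\right)\left(1-\overline{\iota(\alpha_i)}\,T\right)
=\prod_{i=1}^r\left(1-2\,\mathrm{Re}(\iota(\alpha_i))\,T+T^2\right)\in\RR[T]
\end{equation*}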
To prove Theorem 17.4.2, we first guess what the weight should be, then use this guess to prove that things actually work. Let \(\calE\) be an irreducible subquotient of \(\calF\) of rank \(r\text{.}\) We want to show that \(\calE\) is \(\iota\)-pure of some weight \(w\text{.}\) If this were true, then \(\wedge^r \calE\) would be an \(\iota\)-pure, rank \(1\) object of weight \(rw\text{.}\) The following key lemma will let us understand rank-1 objects nicely.
By geometric class field theory, any rank-1 coefficient on a curve corresponds to a character, which can be explicitly written as a constant times a finite order character. So any eigenvalue will be a constant times a root of unity. As roots of unity always have weight 0, the weights of the eigenvalues are all just given by the weight of the constant, so the coefficient must be pure.
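The guess for the weight can now be made explicit. If \(\calE\) of rank \(r\) were \(\iota\)-pure of weight \(w\) with Frobenius eigenvalues \(\alpha_1,\dots,\alpha_r\) at a closed point \(x\) (notation ours), then
\begin{equation*}
\left|\iota\left(\det(\Frob\mid\calE_x)\right)\right|=\prod_{i=1}^r|\iota(\alpha_i)|=\#\kappa(x)^{rw/2}
\end{equation*}
Since the rank-1 object \(\wedge^r\calE\) is known to be \(\iota\)-pure, this suggests taking as the candidate weight of \(\calE\) the weight of \(\wedge^r\calE\) divided by \(r\text{;}\) this candidate is the determinantal weight.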
The hard part is then to show that the determinantal weights of \(\calF\) actually behave like weights with respect to operations like \(\oplus\text{.}\) This takes plenty of work, but eventually we can find some inequality between our guess and reality, then use positivity and duality to flip the inequality and get things on the nose.
The next step is to combine Theorem 17.4.2 with a Fourier transform construction. The key case is for \(\AAA^1\text{,}\) because we’re just working with curves, and you can use the following trick in characteristic \(p\text{.}\) If you take \(x\mapsto x^p+1/x\text{,}\) this gives a finite étale cover \(\GG_m\rightarrow \AAA^1\) of degree \(p+1\text{.}\) This type of thing lets us shove all of the missing points to a single point at \(\infty\text{,}\) so if we were thinking about \(\PP^1 \setminus \{s_1,\dots,s_k\}\text{,}\) we can replace it with \(\AAA^1\text{.}\)
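The claims about the map \(x\mapsto x^p+1/x\) can be checked directly. The following computation is a sketch we supply, writing \(t\) for the coordinate on the target \(\AAA^1\) (a name not used above). In characteristic \(p\text{,}\) and for \(x\neq 0\text{,}\)
\begin{equation*}
\frac{d}{dx}\left(x^p+x^{-1}\right)=px^{p-1}-x^{-2}=-x^{-2},
\qquad
x^p+x^{-1}=t\iff x^{p+1}-tx+1=0
\end{equation*}
The derivative has no zeros on \(\GG_m\text{,}\) so the map is étale. The fiber equation is monic of degree \(p+1\) in \(x\) and never has \(x=0\) as a root, so the map is finite of degree \(p+1\text{.}\)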
The previous argument shows for example that Belyi’s theorem goes out the window in positive characteristic, unless one does something like restrict to tamely ramified maps.
Now the really rough idea of Fourier transforms is to take a function \(f(x)\text{,}\) multiply it by \(e^{-2\pi ix\eta}\text{,}\) and integrate with respect to \(x\) to get a new function in terms of \(\eta\text{.}\) In more geometric terms, you start with a function on \(\RR\text{,}\) pull back to \(\RR \times \RR\text{,}\) twist by a biadditive character, then project on the second factor.
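In formulas, the classical transform and its most direct finite field analogue read as follows. The second formula is an illustration we add, in which \(\psi\colon\FF_q\rightarrow\CC^\times\) is an assumed nontrivial additive character and \(f\) is a complex-valued function on \(\FF_q\text{.}\)
\begin{equation*}
\widehat{f}(\eta)=\int_{\RR} f(x)e^{-2\pi i x\eta}\,dx,
\qquad
\widehat{f}(y)=\sum_{x\in\FF_q} f(x)\psi(xy)
\end{equation*}
In both cases the kernel, \(e^{-2\pi i x\eta}\) or \(\psi(xy)\text{,}\) is a character in each variable separately; this is the biadditive character used for the twist.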
Translating this idea into our language, we’ll start with a coefficient object on \(\AAA^1\text{,}\) pull it back to \(\AAA^1\times \AAA^1\) along the first projection, twist by the Artin–Schreier cover to get a family of coefficients that we mostly understand, then project onto the second factor. Because we rigged our cover so that we understand all of the coefficients in the family besides the original one, the fibers over this second projection are copies of \(\AAA^1\) with coefficient objects that we understand away from a single point. This is the kind of thing that Deligne’s theory of weights is good at dealing with, so we’re now in a good situation; we then use nearby cycles to recover information about the original sheaf. (It is in this last step that we are forced to get something mixed rather than pure.)
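One common way to write this construction on the \(\ell\)-adic side is the following. It is a sketch under stated conventions rather than a formula taken from the text: \(\mathrm{pr}_1,\mathrm{pr}_2\) are the two projections from \(\AAA^1\times\AAA^1\text{,}\) \(m(x,y)=xy\) is the multiplication map, \(\mathcal{L}_\psi\) is the rank-1 sheaf on \(\AAA^1\) cut out from the Artin–Schreier cover by a nontrivial additive character \(\psi\text{,}\) and the use of the compactly supported direct image and the shift \([1]\) are normalizations that vary between references.
\begin{equation*}
\mathrm{FT}_\psi(\calF)=R\mathrm{pr}_{2!}\left(\mathrm{pr}_1^*\calF\otimes m^*\mathcal{L}_\psi\right)[1]
\end{equation*}
Reading from the inside out, the three operations are the pullback, the twist, and the projection described above.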
In the \(p\)-adic world, a coefficient looks like a module over a Weyl algebra, a noncommutative ring containing both multiplication by a coordinate \(x\) and differentiation \(\frac{d}{dx}\) in the same coordinate. There, the Fourier transform can be effected by interchanging these two variables (up to a sign).
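Concretely, the Weyl algebra is governed by a single commutation relation, and the interchange of variables is compatible with it. The sign convention shown here is one possible choice, given for illustration.
\begin{equation*}
\left[\frac{d}{dx},x\right]=1,
\qquad
x\mapsto\frac{d}{dx},\quad \frac{d}{dx}\mapsto -x,
\qquad
\left[-x,\frac{d}{dx}\right]=\left[\frac{d}{dx},x\right]=1
\end{equation*}
The last identity shows that the substitution preserves the defining relation, so it extends to an automorphism of the Weyl algebra; without the sign, the relation would not be preserved.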
A slogan for this argument is that one doesn’t prove Weil II for a single sheaf in isolation. Instead, one proves something about a whole collection of sheaves at once.