Modular Forms I

This is the first in a series of posts where I try to figure out what exactly I know about modular forms. The first three or so will follow these notes very closely. This post will be an extended discussion of the motivation for the definition of modular forms and their most basic properties. Given the behemoth that modular forms are, even future posts will barely scratch the surface, but I would like to at least put what I have read in order.

For me, the starting point of the theory is this: It turns out that surfaces of finite type have a complex structure, and the uniformisation theorem tells us exactly what their universal covers are (up to biholomorphism). One can then prove that the only compact surfaces uniformised by the complex plane itself are tori, also known as elliptic curves. These arise as quotients of $\mathbb{C}$ by a lattice $\Lambda = \mathbb{Z}\tau_1 \oplus \mathbb{Z}\tau_2$, for $\tau_1, \tau_2$ complex numbers which are linearly independent over $\mathbb{R}$. For simplicity, since $\mathbb{Z} \tau_1 = \mathbb{Z} (-\tau_1)$, there's no harm in assuming that $\tau_1 / \tau_2 \in \mathbb{H}$.

In this viewpoint, one can study elliptic curves by studying lattices $\Lambda \subset \mathbb{C}$. Modular forms will correspond to certain functions of lattices, and by extension, to certain functions of elliptic curves.
Note however that there exist many ways to generate the same lattice: $(\tau_1, \tau_2)$ and $(\tau_1', \tau_2')$ define the same lattice exactly when
\[
(\tau_1', \tau_2') = (a \tau_1 + b \tau_2,\ c \tau_1 + d \tau_2)
\]
where $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$. Hence, if we want to consider functions on lattices, they had better be invariant under $\mathrm{SL}_2(\mathbb{Z})$.
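
As a quick sanity check on the change of basis above, here is a minimal Python sketch. The helper names `mobius`, `inverse` and `change_basis` and the sample numbers are my own choices for illustration. It checks that the inverse of a determinant 1 integer matrix is again an integer matrix (so each basis is an integer combination of the other), and that the ratio of the new basis vectors is the Möbius image of the old ratio and stays in $\mathbb{H}$, using the standard identity $\operatorname{Im}(\gamma\tau) = \operatorname{Im}(\tau)/|c\tau + d|^2$.

```python
def mobius(gamma, tau):
    """Apply gamma = (a, b, c, d) to tau as (a*tau + b) / (c*tau + d)."""
    a, b, c, d = gamma
    return (a * tau + b) / (c * tau + d)


def inverse(gamma):
    """Inverse of an integer matrix of determinant 1; the entries stay integers."""
    a, b, c, d = gamma
    assert a * d - b * c == 1, "expected a matrix in SL_2(Z)"
    return (d, -b, -c, a)


def change_basis(gamma, t1, t2):
    """Return (a*t1 + b*t2, c*t1 + d*t2)."""
    a, b, c, d = gamma
    return (a * t1 + b * t2, c * t1 + d * t2)


gamma = (2, 1, 5, 3)                            # det = 2*3 - 1*5 = 1
t1, t2 = complex(0.3, 1.2), complex(1.0, 0.0)   # t1 / t2 lies in H

# New basis of the same lattice.
s1, s2 = change_basis(gamma, t1, t2)

# Applying the inverse matrix recovers the old basis, so the lattices agree.
r1, r2 = change_basis(inverse(gamma), s1, s2)

tau = t1 / t2
new_tau = mobius(gamma, tau)                    # equals s1 / s2
print(new_tau.imag > 0)
```

The point of the check is that the integrality of the inverse is exactly what determinant 1 buys us: without it, the new vectors would only span a sublattice.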

Suppose we have a function
\[
F: \{\text{Lattices}\} \to \mathbb{C}.
\]
First, observe that multiplying a lattice by a non-zero scalar (i.e., $\lambda \Lambda$ for $\lambda \in \mathbb{C}^\times$) amounts to rotating and rescaling the lattice. In fact, since we really care about elliptic curves, and $\mathbb{C}/\Lambda \cong \mathbb{C}/\lambda \Lambda$ under the isomorphism $z \mapsto \lambda z$, $F$ should be completely invariant under such rescalings — i.e., we should insist that
\[
F(\lambda \Lambda) = F(\Lambda).
\]
However, if we define $F$ like this, we are forced to insist that $F$ has no poles. This is needlessly restrictive. So instead, we require that
\[
F(\lambda \Lambda) = \lambda^{-k} F(\Lambda)
\]
for some integer $k$; the quotient $F/G$ of two weight $k$ functions gives a fully invariant function, this time with poles allowed.

If $\Lambda = \mathbb{Z} \tau \oplus \mathbb{Z}$ with $\tau \in \mathbb{H}$, define a function $f: \mathbb{H} \to \mathbb{C}$ by
\[
f(\tau) = F(\Lambda).
\]
For a general lattice, we have:
\[
F(\mathbb{Z} \tau_1 \oplus \mathbb{Z} \tau_2) = F\left(\tau_2\left( \mathbb{Z} \left(\frac{\tau_1}{\tau_2}\right) \oplus \mathbb{Z} \right)\right) = \tau_2^{-k} f\left(\frac{\tau_1}{\tau_2}\right),
\]
and in particular,
\[
f(\tau) = F(\mathbb{Z} \tau \oplus \mathbb{Z}) = F(\mathbb{Z}(a \tau + b) \oplus \mathbb{Z}(c \tau + d)) = (c \tau + d)^{-k} f\left( \frac{a \tau + b}{c \tau + d} \right)
\]
by $\mathrm{SL}_2(\mathbb{Z})$ invariance of $F$ (second equality) and the previous formula (third equality).

This leads to a different point of view. Consider the upper half plane and add in the point at infinity as well as its orbit under the action of $\mathrm{SL}_2(\mathbb{Z})$, i.e. all the rational points on the real line. The quotient of this, a compactification of $\mathbb{H} / \mathrm{SL}_2(\mathbb{Z})$, is what's known as a modular curve and we will denote this by $X$.

We would like to say that modular forms correspond to holomorphic functions on $X$. However, for that we would want these functions to be invariant, i.e., satisfy $f\left( \frac{a \tau + b}{c \tau + d} \right) = f(\tau)$. This isn't exactly the condition for modular forms (unless $k = 0$). Instead, it turns out that modular forms correspond to different objects on $X$, namely $\ell$-fold differentials (which are like higher-order differential forms). We denote them by $f(z)(dz)^\ell$.

The difference from plain functions on $X$ is how they behave under the change of coordinates: writing $z = g(w)$ for some holomorphic function $g$, we expect to have
\[
f(z)(dz)^\ell = f(g(w))(dg(w))^\ell = f(g(w))(g'(w)\,dw)^\ell = f(g(w))\,g'(w)^\ell (dw)^\ell.
\]
In particular, for $g(w) = \dfrac{aw + b}{cw + d}$ with $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, we have $g'(w) = (cw + d)^{-2}$, hence
\[
f(z)(dz)^\ell = (cw + d)^{-2\ell} f\left( \frac{aw + b}{cw + d} \right) (dw)^\ell.
\]
If we want this object to be well defined on $X$, it should be invariant under the change $w \mapsto z$. Thus, we would like
\[
f(w)(dw)^\ell = (cw + d)^{-2\ell} f\left( \frac{aw + b}{cw + d} \right)(dw)^\ell,
\]
i.e.,
\[
f(w) = (cw + d)^{-2\ell} f\left( \frac{aw + b}{cw + d} \right),
\]
which is again the modular form condition, but for weight $2\ell$. Hence, at least for even weights, we can interpret modular forms as analytic objects defined on a certain Riemann surface. 

Given the above discussion, we can finally write down a definition of modular forms which is hopefully motivated.

Definition: A modular form of weight $k$ (and level 1) is a holomorphic function $f: \mathbb{H} \to \mathbb{C}$ satisfying

  • Automorphy: for every $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, \[
    f(\tau) = (c \tau + d)^{-k} f\left( \frac{a \tau + b}{c \tau + d} \right)
    \]
  • Growth: for any $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, the function $(c \tau + d)^{-k} f(\gamma(\tau))$ is bounded as $\operatorname{Im} \tau \to \infty$

We'll come back to what level 1 means. The second condition is called being holomorphic at the cusp and turns out to be simply the condition that this function is holomorphic at one of the points of the modular curve. If in addition it tends to zero at the cusp then the form is called a cusp form.

An urban legend tells the story of a researcher who studied anti-metric spaces: ones where the inequality sign in the triangle inequality is reversed. He managed to prove many amazing properties about such spaces, until it was pointed out that such a space can contain at most one point.

The definition of modular forms forces such functions to exhibit a lot of symmetry, which will be useful in later calculations. However, to avoid repeating others' mistakes, it would be prudent to first exhibit a non-trivial example of a modular form after giving the definition. Eisenstein series give some of the simplest examples of modular forms.

Let $k\geq 3$ be an integer and $\tau \in \mathbb{H}$. Define the Eisenstein series of weight $k$ to be
\[
G_k(\tau)= \sum_{(n,m) \in \mathbb{Z}^2 \setminus \{(0,0)\}} \frac{1}{(n+m\tau)^k}.
\]
Theorem
Eisenstein series have the following properties:

  • $G_k(\tau)$ converges absolutely for $k \geq 3$ and is a holomorphic function of $\tau$
  • $G_k(\tau) \equiv 0$ if $k$ is odd
  • $G_k(\tau)$ satisfies $G_k(\tau+1)=G_k(\tau)$ and $G_k(\tau)=\tau^{-k}G_k(-1/\tau)$

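One could prove these with complex analysis. Since $\tau \mapsto \tau + 1$ and $\tau \mapsto -1/\tau$ generate the action of $\mathrm{SL}_2(\mathbb{Z})$ on $\mathbb{H}$, the last property gives the automorphy condition for every element of $\mathrm{SL}_2(\mathbb{Z})$. From now on, $E_k = G_k / (2\zeta(k))$ denotes the Eisenstein series rescaled so that $E_k(\tau) \to 1$ as $\operatorname{Im} \tau \to \infty$.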
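
The two transformation laws can also be tested numerically. The sketch below does not sum over the lattice directly (that converges slowly); instead it assumes the standard $q$-expansions $E_4 = 1 + 240\sum_{n \geq 1} \sigma_3(n) q^n$ and $E_6 = 1 - 504\sum_{n \geq 1} \sigma_5(n) q^n$ with $q = e^{2\pi i \tau}$, where $E_k$ is $G_k$ rescaled to have constant term $1$. These expansions are quoted, not derived in this post, and the function names and the sample point are mine.

```python
import cmath


def sigma(power, n):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def eisenstein(k, tau, terms=60):
    """Truncated q-expansion of the normalised Eisenstein series E_4 or E_6."""
    coeff = {4: 240, 6: -504}[k]
    q = cmath.exp(2j * cmath.pi * tau)
    return 1 + coeff * sum(sigma(k - 1, n) * q ** n for n in range(1, terms + 1))


tau = complex(0.2, 1.1)   # arbitrary sample point in the upper half plane
checks = {}
for k in (4, 6):
    value = eisenstein(k, tau)
    shifted = eisenstein(k, tau + 1)                 # should equal value
    inverted = tau ** (-k) * eisenstein(k, -1 / tau)  # should equal value
    checks[k] = (value, shifted, inverted)
    print(k, abs(value - shifted), abs(value - inverted))
```

Both printed differences should be at the level of floating point rounding. Invariance under $\tau \mapsto \tau + 1$ is automatic from the $q$-expansion, so the informative check is the one for $\tau \mapsto -1/\tau$.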

Now we are going to describe all modular forms! For $k \in \mathbb{Z}$, we write $M_k = M_k(\Gamma(1))$ for the set of modular forms of weight $k$ (and level $1$). We write $S_k = S_k(\Gamma(1)) \subseteq M_k$ for the subset of cusp forms. These are $\mathbb{C}$-vector spaces, and are zero for odd $k$.

Moreover, from the definition, we have a natural product
\[
  M_k \cdot M_\ell \subseteq M_{k + \ell}.
\]
Likewise, we have
\[
  S_k\cdot M_\ell \subseteq S_{k + \ell}.
\]
We let
\[
  M_* = \bigoplus_{k \in \mathbb{Z}} M_k,\quad S_* = \bigoplus_{k \in \mathbb{Z}} S_k.
\]
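Then $M_*$ is a graded ring and $S_*$ is a graded ideal. Since $f(\tau + 1) = f(\tau)$, a modular form has a $q$-expansion $f = \sum_{n \geq 0} a_n(f) q^n$ with $q = e^{2\pi i \tau}$, and by definition we have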
\[
  S_k = \ker (a_0: M_k \to \mathbb{C}).
\]
To figure out what all the modular forms are, we use the following constraints on the zeroes of a modular form:

Proposition Let $f$ be a weak modular form (i.e. it can be meromorphic at $\infty$) of weight $k$ and level $1$. If $f$ is not identically zero, then
  \[    \left(\sum_{z_0 \in \mathcal{D} \setminus \{i, \rho\}} ord_{z_0} (f)\right) + \frac{1}{2} ord_i(f) + \frac{1}{3} ord_\rho(f) + ord_\infty(f) = \frac{k}{12},
  \]
  where $\mathcal{D}$ is the standard fundamental domain for the action of $\Gamma(1)$ on $\mathbb{H}$, $\rho = e^{2\pi i/3}$, and $ord_\infty f$ is the least $r \in \mathbb{Z}$ such that $a_r(f) \not= 0$.

Note that if $\gamma \in \Gamma(1)$, then $j(\gamma, z) = cz + d$ is never $0$ for $z \in \mathbb{H}$. So it follows that $ord_z f = ord_{\gamma(z)} f$.

One could prove this by integrating along a suitable contour and using the argument principle.

Corollary  If $k < 0$, then $M_k = \{0\}$.

Corollary  If $k = 0$, then $M_0 = \mathbb{C}$, the constants, and $S_0 = \{0\}$.

Proof If $f \in M_0$, let $g = f - f(i) \in M_0$. If $f$ is not constant, then $g \not\equiv 0$ and $ord_i g \geq 1$, so the LHS is $>0$, but the RHS is $=0$. So $f \in \mathbb{C}$. Of course, $a_0(f) = f$ for a constant $f$. So $S_0 = \{0\}$. $\blacksquare$



Corollary  \[
    \dim M_k \leq 1 + \frac{k}{12}.
  \]
 In particular, they are finite dimensional.


Proof We let $f_0, \cdots, f_d$ be $d + 1$ elements of $M_k$, and we choose distinct points $z_1, \cdots, z_d \in \mathcal{D} \setminus \{i, \rho\}$. Then there exist $\lambda_0, \cdots, \lambda_d \in \mathbb{C}$, not all $0$, such that
  \[
    f = \sum_{i = 0}^d \lambda_i f_i
  \]
  vanishes at all these points. Now if $d > \frac{k}{12}$ and $f \not\equiv 0$, then the LHS of the formula above is $\geq d > \frac{k}{12}$, a contradiction. So $f \equiv 0$. So $(f_i)$ are linearly dependent, i.e. $\dim M_k < d + 1$. $\blacksquare$



Corollary  $M_2 = \{0\}$ and $M_k = \mathbb{C} E_k$ for $4 \leq k \leq 10$ ($k$ even). We also have $E_8 = E_4^2$ and $E_{10} = E_4 E_6$.


Proof: Only $M_2 = \{0\}$ requires proof. If $0 \not= f \in M_2$, then this implies
  \[
    a + \frac{b}{2} + \frac{c}{3} = \frac{1}{6}
  \]
  for integers $a, b, c \geq 0$, which is not possible. Alternatively, if $f \in M_2$, then $f^2 \in M_4$ and $f^3 \in M_6$ are multiples of $E_4$ and $E_6$ respectively. Comparing $(f^2)^3 = (f^3)^2$ and their constant terms then forces either $f = 0$ or $E_4^3 = E_6^2$, and the latter is not the case as we will soon see. Finally, $E_8$ is exactly $E_4^2$, and not just a multiple of it, because both have leading coefficient $1$. $\blacksquare$
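
The first argument is a finite search, so it is easy to mechanise. In the sketch below (the function name `zero_patterns` is mine), $a$ stands for the total order of vanishing away from $i$ and $\rho$, including at $\infty$, while $b = ord_i f$ and $c = ord_\rho f$. The code lists every way of satisfying the zero-counting formula for a given weight, assuming all orders are non-negative, which holds for a genuine modular form.

```python
from fractions import Fraction


def zero_patterns(k):
    """All non-negative integer triples (a, b, c) with a + b/2 + c/3 = k/12."""
    target = Fraction(k, 12)
    solutions = []
    for a in range(k // 12 + 1):          # a <= k/12
        for b in range(k // 6 + 1):       # b/2 <= k/12
            for c in range(k // 4 + 1):   # c/3 <= k/12
                if a + Fraction(b, 2) + Fraction(c, 3) == target:
                    solutions.append((a, b, c))
    return solutions


for k in (2, 4, 6, 8, 10, 12):
    print(k, zero_patterns(k))
```

For $k = 2$ the list is empty, which is the claim $M_2 = \{0\}$. For $k = 4$ and $k = 6$ there is exactly one pattern each, so any non-zero form of weight $4$ has a simple zero at $\rho$ and no others, and any non-zero form of weight $6$ has a simple zero at $i$ and no others.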

Corollary  A non-zero cusp form of weight $12$ is
  \[
    E_4^3 - E_6^2 = (1 + 240 q + \cdots)^3 - (1 - 504 q + \cdots)^2 = 1728q + \cdots.
  \]

Note that $1728 = 12^3$.

Define  \[
    \Delta = \frac{E_4^3 - E_6^2}{1728} = \sum_{n \geq 1} \tau(n) q^n \in S_{12}.
  \]
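
The first few coefficients of $\Delta$ can be computed exactly with integer arithmetic. This is only a sketch: it assumes the standard $q$-expansions $E_4 = 1 + 240\sum \sigma_3(n) q^n$ and $E_6 = 1 - 504\sum \sigma_5(n) q^n$ (quoted, not derived here), and the helper names are mine.

```python
def sigma(power, n):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def eisenstein_coeffs(k, n_terms):
    """Coefficients a_0, ..., a_{n_terms - 1} of E_4 or E_6."""
    coeff = {4: 240, 6: -504}[k]
    return [1] + [coeff * sigma(k - 1, n) for n in range(1, n_terms)]


def multiply(f, g):
    """Product of two q-series, truncated to len(f) coefficients."""
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(len(f))]


N_TERMS = 6
e4 = eisenstein_coeffs(4, N_TERMS)
e6 = eisenstein_coeffs(6, N_TERMS)

e4_cubed = multiply(multiply(e4, e4), e4)
e6_squared = multiply(e6, e6)
numerator = [x - y for x, y in zip(e4_cubed, e6_squared)]

# Every coefficient of E_4^3 - E_6^2 should be divisible by 1728.
assert all(c % 1728 == 0 for c in numerator)
delta = [c // 1728 for c in numerator]
print(delta)  # coefficients of q^0, ..., q^5
```

The printed list is `[0, 1, -24, 252, -1472, 4830]`, i.e. $\tau(1), \ldots, \tau(5) = 1, -24, 252, -1472, 4830$. Note that integrality of the coefficients is not obvious from the definition; the code only confirms it for the terms it computes.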

The function $\tau$ is very interesting, and is called Ramanujan's $\tau$-function. It has nice arithmetic properties that we may discuss at some point.



  We have
  \[
    \sum_{z_0 \not= i, \rho} ord_{z_0} \Delta + \frac{1}{2} ord_i \Delta + \frac{1}{3} ord_\rho \Delta + ord_\infty \Delta = \frac{12}{12} = 1.
  \]
  Since $ord_\infty \Delta = 1$ (the $q$-expansion of $\Delta$ begins with $q$), it follows that there can't be any other zeroes and hence that $\Delta(z) \not= 0$ for all $z \in \mathbb{H}$.

It follows from this that

Proposition The map $f \mapsto \Delta f$ is an isomorphism $M_{k - 12}(\Gamma(1)) \to S_k(\Gamma(1))$ for all $k > 12$.


Proof Since $\Delta \in S_{12}$, it follows that if $f \in M_{k - 12}$, then $\Delta f \in S_k$. So the map is well-defined, and we certainly get an injection $M_{k - 12} \to S_k$. Now if $g \in S_k$, then since $ord_\infty \Delta = 1 \leq ord_\infty g$ and $\Delta \not= 0$ on $\mathbb{H}$, the quotient $\frac{g}{\Delta}$ is holomorphic on $\mathbb{H}$ and at $\infty$. So $\frac{g}{\Delta}$ is a modular form of weight $k - 12$, and the map is surjective. $\blacksquare$


We can put all this together to prove:
Theorem 

  1.    If $k \geq 4$ and even, then
          \[
            M_k = S_k \oplus \mathbb{C} E_k.
          \]
  2. We have \[ \dim M_k (\Gamma(1)) =  \begin{cases}0 & k < 0\text{ or }k \text{ odd }\\  \left\lfloor \frac{k}{12}\right\rfloor & k > 0, k \equiv 2 \pmod{12}\\ 1 + \left\lfloor \frac{k}{12}\right\rfloor & \text{otherwise}\end{cases} \] 
  3.  Every element of $M_k$ is a polynomial in $E_4$ and $E_6$.
  4. Let

      \[
        b =
        \begin{cases}
          0 & k\equiv 0 \pmod 4\\
          1 & k\equiv 2 \pmod 4
        \end{cases}.
      \]
      Then
      \[
        \{h_j = \Delta^j E_6^b E_4^{(k - 12j - 6b)/4} : 0 \leq j < \dim M_k\}
      \]
      is a basis for $M_k$, and
      \[
        \{h_j : 1 \leq j < \dim M_k\}
      \]
      is a basis for $S_k$.
Proof 1: $S_k$ is the kernel of the linear map $M_k \to \mathbb{C}$ sending $f \mapsto a_0(f)$. So $S_k$ has codimension at most $1$ in $M_k$, and $E_k \in M_k$ is not in $S_k$ since $a_0(E_k) = 1$. So we are done.

2: For $k < 12$, this agrees with what we have already proved. By the proposition, we have
      \[
        \dim M_{k - 12} = \dim S_k.
      \]
      So we are done by induction and 1.
3: This is true for $k < 12$. If $k \geq 12$ is even, then we can find $a, b \geq 0$ with $4a + 6b = k$. Then $E_4^a E_6^b \in M_k$, and is not a cusp form since its constant term is $1$. So
      \[
        M_k = \mathbb{C} E_4^a E_6^b \oplus \Delta M_{k - 12}.
      \]
      But $\Delta$ is a polynomial in $E_4, E_6$, so we are done by induction on $k$.
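4: By 2, we know $k - 12j - 6b \geq 0$ for $j < \dim M_k$, and is a multiple of $4$. So $h_j \in M_k$. Next note that the $q$-expansion of $h_j$ begins with $q^j$. So they are all linearly independent, and since there are $\dim M_k$ of them, they form a basis of $M_k$. The $h_j$ with $j \geq 1$ are cusp forms, and there are $\dim M_k - 1 = \dim S_k$ of them. $\blacksquare$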
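
Parts 2 and 4 of the theorem are easy to tabulate. The following sketch (function names are mine) implements the dimension formula, lists the exponents $(j, b, e)$ of the basis elements $h_j = \Delta^j E_6^b E_4^e$, and compares the dimension with the number of monomials $E_4^a E_6^b$ of weight $k$. That last comparison relies on the standard fact, not proved in this post, that $E_4$ and $E_6$ are algebraically independent, so that these monomials form a basis of $M_k$.

```python
def dim_M(k):
    """Dimension of M_k for level 1, following part 2 of the theorem."""
    if k < 0 or k % 2 == 1:
        return 0
    if k % 12 == 2:
        return k // 12
    return k // 12 + 1


def monomial_count(k):
    """Number of pairs (a, b) of non-negative integers with 4a + 6b = k."""
    if k < 0:
        return 0
    return sum(1 for b in range(k // 6 + 1) if (k - 6 * b) % 4 == 0)


def basis_exponents(k):
    """Exponents (j, b, e) with h_j = Delta^j * E_6^b * E_4^e, as in part 4."""
    b = 0 if k % 4 == 0 else 1
    return [(j, b, (k - 12 * j - 6 * b) // 4) for j in range(dim_M(k))]


for k in (12, 14, 24):
    print(k, dim_M(k), basis_exponents(k))
```

For example, $k = 24$ gives the basis $E_4^6$, $\Delta E_4^3$, $\Delta^2$, and $k = 14$ gives the single element $E_6 E_4^2$.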

Now we explain what level 1 means. For every positive integer $N$ the group $\mathrm{SL}_2(\mathbb{Z})$ admits a homomorphism to $\mathrm{SL}_2(\mathbb{Z}/N\mathbb{Z})$ just by reducing the entries mod $N$, and the kernel of this map is denoted $\Gamma(N)$. In particular, $\Gamma(1) = \mathrm{SL}_2(\mathbb{Z})$. In fact, for any finite index subgroup $\Gamma \leq \Gamma(1)$, one can define modular forms on $\Gamma$ by an analogous process. The automorphy condition is replaced by transforming appropriately under the elements of $\Gamma$, and there exist analogous results, e.g. dimension formulae, for these modular forms.
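
To get a feel for how large $\Gamma(N)$ is inside $\Gamma(1)$, one can count matrices mod $N$. The sketch below assumes two standard facts that are not shown in this post: reduction mod $N$ is surjective onto $\mathrm{SL}_2(\mathbb{Z}/N\mathbb{Z})$, so the index of $\Gamma(N)$ in $\Gamma(1)$ is the order of that finite group, and this order equals $N^3 \prod_{p \mid N} (1 - p^{-2})$. The function names are mine, and the brute-force count is only practical for small $N$.

```python
from itertools import product


def sl2_order(n):
    """Brute-force count of 2x2 matrices mod n with determinant 1."""
    return sum(
        1
        for a, b, c, d in product(range(n), repeat=4)
        if (a * d - b * c) % n == 1 % n
    )


def index_formula(n):
    """n^3 times the product of (1 - 1/p^2) over primes p dividing n."""
    result = n ** 3
    remaining, p = n, 2
    while remaining > 1:
        if remaining % p == 0:
            # Multiply by (p^2 - 1) / p^2; the division is exact since p | n.
            result = result // (p * p) * (p * p - 1)
            while remaining % p == 0:
                remaining //= p
        p += 1
    return result


for n in range(1, 7):
    print(n, sl2_order(n), index_formula(n))
```

The two columns agree for $N = 1, \ldots, 6$, giving indices $1, 6, 24, 48, 120, 144$.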

We close by mentioning one final interpretation of modular forms: as sections of certain line bundles. A modular curve is a Riemann surface constructed as a quotient of the upper half plane by the action of a congruence subgroup. These parametrise isomorphism classes of elliptic curves with some additional structure.


Let $X$ be a modular curve and $\mathcal{L}$ a line bundle on $X$. Write the curve as $X = \mathbb{H}/\Gamma$, where $\Gamma \subset \mathrm{PSL}(2, \mathbb{R})$ acts freely on the upper half-plane $\mathbb{H}$.

The pullback of $\mathcal{L}$ to $\mathbb{H}$ is the trivial line bundle $\mathbb{H} \times \mathbb{C}$. This implies that $\mathcal{L}$ is the quotient of $\mathbb{H} \times \mathbb{C}$ by $\Gamma$ acting via
\[\gamma \cdot (\tau, z) = (\gamma \tau, e_\gamma(\tau) z),\]
where the map $\gamma \mapsto e_\gamma$ is a $1$-cocycle of $\Gamma$ with values in $\mathcal{O}(\mathbb{H})^\ast$.
It follows that the sections of $\mathcal{L}$ correspond to functions $f: \mathbb{H} \to \mathbb{C}$ satisfying
\[f(\gamma \tau) = e_\gamma(\tau) f(\tau).\]

These functions can be considered modular forms in a generalized sense. In particular, if you take $\mathcal{L} = K_X^m$, the $m$-th power of the canonical bundle on $X$, then
\[
e_\gamma(\tau) = (c \tau + d)^{2m}
\quad \text{for } \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
\]
and $f$ becomes a genuine modular form of weight $2m$ for $\Gamma$.

For a discussion of how all of this connects to arithmetic properties, see here.

