The Markov equation

 A Markov number (there are different ways of transliterating from Russian, e.g. Markoff, but I will standardise to Markov) is a positive integer $x, y$ or $z$ that is part of a solution to the Markov Diophantine equation
\[x^{2}+y^{2}+z^{2}=3xyz\]
At a glance this might not seem like a particularly noteworthy equation, but it turns out to have connections to a huge number of interesting questions in both geometry and number theory. This post will discuss some of the many number theoretic aspects. For more details see either the survey by Gamburd or the book by Aigner, which I consulted when writing this post. The geometric side is exposed in this article by Series as well as Gamburd's survey.

Initial observations

 Fixing $x_2$ and $x_3$ makes the Markov equation a quadratic equation in $x_1$. The sum of the two solutions to this quadratic equation is $3x_2x_3$. In other words, the map \[(x_1,x_2,x_3)\mapsto (3x_2x_3-x_1,x_2,x_3),\] which is called a Vieta involution, preserves Markov triples. 

The first equality says that $m_1'$ is an integer, and the second that $m_1'$ is positive. Hence $(m_1', m_2, m_3)$ is another Markov triple, and $m_1'$ another Markov number. Repeating with $m_2$ and $m_3$, we see that for every nonsingular triple $(m_1, m_2, m_3)$ we get three others, called the neighbouring triples:

\[(m_1' = 3m_2m_3 - m_1, m_2, m_3),\]
\[(m_1, m_2' = 3m_1m_3 - m_2, m_3),\]
\[(m_1, m_2, m_3' = 3m_1m_2 - m_3).\]

Assume, without loss of generality, that $m_1 > m_2 > m_3$. Then
\[m_2' > m_1 > m_3, \quad m_3' > m_1 > m_2.\]
Furthermore,
\[f(m_2) = m_2^2 - 3m_2^2m_3 + m_2^2 + m_3^2 = 2m_2^2 + m_3^2 - 3m_2^2m_3 < 0\]
implies that $m_2$ lies between $m_1$ and $m_1'$; hence
\[m_2 > m_1', m_3\]
holds for the first neighbour.

Now we set up the Markov tree. Looking at the nonsingular triple $(m_1, m_2, m_3)$ and its neighbours we see that  
\[m_3' > m_2' > m_1 > m_2.\]  
Hence we get four different triples. And now comes the crucial observation: Two of the neighbours have $\textit{larger}$ maximum (namely $m_2', m_3'$) than $m_1$ in the triple $(m_1, m_2, m_3)$, and one neighbour has $\textit{smaller}$ maximum (namely $m_2$).  

We write the maximum in the middle and underline it, and start with $(1, \underline{1}, 1)$, $(1, 2, 1)$, and $(1, \underline{5}, 2)$. The recursive rule  

\[\ell, \underline{m}, r\]
\[\ell, \underline{3\ell m - r, m} \quad \underline{m, 3mr - \ell, r}\]

continues the tree downwards with ever increasing maxima. In words: Take the left pair $(\ell, m)$, apply the neighbouring triple construction, go to the left child, write $\ell, m$ in the same order as outer numbers of the new triple, and put the new maximum $3\ell m - r$ in the middle. In the same way, we take the right pair $(m, r)$ and proceed to the right child. The third neighbour $(\ell, r, 3\ell r - m)$ is, of course, the parent of $(\ell, m, r)$ one row up. It may be $(\ell, r, 3\ell r - m)$ or $(3\ell r - m, \underline{\ell}, r)$, depending on which of $\ell$ or $r$ is larger.

$\textbf{Theorem 1:}$ All nonsingular Markov triples appear exactly once in the Markov tree $T_M$

$\textbf{Proof:}$ Suppose $(a, m, b)$ is a nonsingular triple with maximum $m$. There is exactly one neighbour with smaller maximum $a$ or $b$, namely $(a, b, 3ab - m)$ if $b > a$ respectively $(3ab - m, a, b)$ if $a > b$. Going back in this way, we decrease the maximum each time and end up eventually at $(1, 5, 2)$ or $(2, 5, 1)$, since this is the only triple with maximum 5. Retracing our steps in $T_M$ from $(1, 5, 2)$, we find that $(a, m, b)$ or $(b, m, a)$ is in the tree. Uniqueness is clear, since the neighbour with smaller maximum is uniquely determined, and we can argue by induction on the maximum. $\square$

$\textbf{Corollary 2:}$ Any two Markov numbers in a Markov triple are relatively prime.

$\textbf{Proof:}$ This is certainly true for the triples $(1, 1, 1)$, $(1, 2, 1)$, and $(1, 5, 2)$. Equation (3.1) implies that a common divisor $d$ of any two numbers of a triple also divides the third. It follows then from (3.3) that $d$ divides the three numbers of the smaller neighbour as well. Going back in the tree, we eventually conclude that $d$ divides 1, 5, and 2, and so $d = 1$. $\blacksquare$

A related fact that one can check that (1,1,1) and (1,1,2) are the only solutions with repeated entries. Note also that the left branch gives rise to tuples $(1, F_{2n-1}, F_{2n-3})$ of Fibonacci numbers, and the right branch to tuples $(P_{2n-1}, P_{2n+1}, 2)$ of Pell numbers.

Prime factors


Almost all Markov numbers are composite. In fact, for any $v \geq 1$

\[\sum_{\substack{m \in M^s ;m \leq T\\ m \text{ has at most vdistinct prime factors}}} 1 = o \left( \sum_{\substack{m \in M^s \\ m \leq T}} 1 \right).\]

The proof of this relies on a very deep result of Mirzakhani on orbit equidistribution as well as other difficult theorems. A nice elementary fact that a secondary student could instead show about Markov numbers is that every prime factor $p$ has to be $\equiv 1 \mod{4}$ and if a Markov number is even then it is $\equiv 2 \mod{32}$. However, it is open and appears out of reach whether there are infinitely many square-free Markov numbers. It is also open whether every Markov number appears uniquely as the maximum number of a Markov tuple. Many variations of this question exist, all of which go by some name like the unicity conjecture.

Approximation results


Let $\alpha$ be an irrational number. A celebrated theorem of Hurwitz asserts that $\alpha$ admits infinitely many rational approximations $p/q$ such that $|\alpha - \frac{p}{q}| < \frac{1}{\sqrt{5}q^2}$ and, moreover, that if $\alpha$ is $\mathrm{GL}_2(\mathbb{Z})$-equivalent to the Golden Ratio $\theta_1 = (1 + \sqrt{5})/2$, in the sense that $\alpha = \frac{a\theta_1 + b}{c\theta_1 + d}$ for some integers $a, b, c, d$ with $ad - bc = \pm 1$, the above result is sharp and the constant $\frac{1}{\sqrt{5}}$ cannot be replaced by any smaller number.

Suppose next that $\alpha$ is not $\mathrm{GL}_2(\mathbb{Z})$-equivalent to $\theta_1$. Then a result of Markov's doctoral advisors, Korkine and Zolotareff, asserts that $\alpha$ admits infinitely many rational approximations $p/q$ such that $|\alpha - \frac{p}{q}| < \frac{1}{\sqrt{8}q^2}$, and, moreover, that the constant $\frac{1}{\sqrt{8}}$ is sharp if and only if $\alpha$ is $\mathrm{GL}_2(\mathbb{Z})$-equivalent to $\theta_2 = 1 + \sqrt{2}$.

The general result found by Markov is as follows:

$\textbf{Markov's Theorem:}$ Let $M = \{1, 2, 5, 13, 29, 34, 89, 169, 194, \ldots\}$ be the sequence of Markov numbers. There is a sequence of associated quadratic irrationals $\theta_i \in \mathbb{Q}(\sqrt{\Delta_i})$, where $\Delta_i = 9m_i^2 - 4$ and $m_i$ is the $i$-th element of the sequence, with the following property. Let $\alpha$ be a real irrational, not $\mathrm{GL}_2(\mathbb{Z})$-equivalent to any of the numbers $\theta_i$ whenever $m_i < m_j$. Then $\alpha$ admits infinitely many rational approximations $p/q$ with
\[|\alpha - \frac{p}{q}| < \frac{m_j}{\sqrt{\Delta_j} q^2};\]
the constant $m_j / \sqrt{\Delta_j}$ is sharp if and only if $\alpha$ is $\mathrm{GL}_2(\mathbb{Z})$-equivalent to $\theta_h$, for some $h$ such that $m_h = m_j$.


Integral points


Zagier showed that the sequence of Markov numbers is sparse in the integers. More precisely, that \[\sum_{\substack{m \in \mathbb{M}^s \\ m \leq T}} 1 \sim c (\log T)^2 \quad \text{as } T \to \infty \, (c > 0).\]

The Markov equation happens to be an example of a non-singular cubic, so naturally the algebraic geometers and number theorists would like to extend results about the Markov equation to other such cubics, starting with ones whose defining equation looks like the Markov equation. Zagier's result can be generalised to an asymptotic formula for the number of integer solutions to the Markov-Hurwitz equation

\[x_1^2 + x_2^2 + \cdots + x_n^2 = a x_1 x_2 \cdots x_n + k\]

Let $V$ be the subvariety of $\mathbb{C}^n$ cut out by this equation. Zagier's result and its generalisation can be viewed as a statement about asymptotic growth of integral points on the Markov variety, \( |V(\mathbb{Z}) \cap B(T)| \sim (\log T)^2 \). 

The issue of the existence of a single integral solution to the Markov-Hurwitz equation for general $a$ and $k$, even for $n = 3$, is quite subtle. For instance, it is not hard to show that when $k=0$ and $n=3$ there are solutions iff $a=1, 3$.  Little is known about the values at integers assumed by affine cubic forms$^{17}$ $F$ in three variables. For $k \neq 0$, set
\[V_{k,F} = \{x = (x_1, x_2, x_3) : F(x) = k\}. \]
The basic question is for which $k$ is $V_{k,F}(\mathbb{Z}) \neq \emptyset$, or, more generally, infinite or Zariski dense in $V_{k,F}$?

A prime example is $F = S$, the sum of three cubes,
\[S(x_1, x_2, x_3) = x_1^3 + x_2^3 + x_3^3.\]

There are obvious local congruence obstructions, namely that $V_{k,S}(\mathbb{Z}) = \emptyset$ if $k \equiv 4, 5 \pmod{9}$, but beyond that, it is possible that the answers to all three questions are yes for all the other $k$'s, which we call the admissible values. One hope would be that the congruence obstructions are all there is. If all local (i.e. coming from restricting attention to a prime, e.g. in a quotient by or completion at a prime ideal) obstructions vanish, the simplest case would be that there is a solution which satisfies all the local equalities/conditions, which would give rise to many integer points. For instance, it is known that the map $q_n: \mathrm{SL}_n(\mathbb{Z}) \to \mathrm{SL}_n(\mathbb{Z}/q\mathbb{Z})$ is surjective for every $n$, which is known as strong approximation for the algebraic group $\mathrm{SL}_n$. This should be thought of as a souped up version of the Chinese remainder theorem. When instead one looks at completions at primes, such things go by the name of Hasse principles, after the Hasse-Minkowski theorem stating that every quadratic form has a solution in $\mathbb{Q}$ if and only if it has solutions in $\mathbb{R}$ and $\mathbb{Q}_p$ for every prime $p$ (analogous results hold for number fields).  It is known that strong approximation in its strongest form fails for $V_{k,S}(\mathbb{Z})$; the global obstruction coming from an application of cubic reciprocity. Moreover, it is also known that $V_{1,S}(\mathbb{Z})$ is Zariski dense in $V_{1,S}$.

Ghosh and Sarnak investigate the Markov form $F = M$,
\[M(x) = x_1^2 + x_2^2 + x_3^2 - x_1 x_2 x_3. \]
Let $\Gamma$ be the semigroup generated by
\[\gamma_1 = \begin{pmatrix}0 & 1 & 0 \\0 & 0 & 1 \\1 & 1 & 1\end{pmatrix},\quad \gamma_2 = \begin{pmatrix}1 & 0 & 0 \\0 & 0 & 1 \\1 & 1 & 1\end{pmatrix},\quad \gamma_3 = \begin{pmatrix}1 & 0 & 0 \\0 & 1 & 0 \\1 & 1 & 1\end{pmatrix}.\]
Except for the case of the Cayley cubic with $k = 4$, $V_{k,M}(\mathbb{Z})$ decomposes into a finite number $b_M(k)$ of $\Gamma$-orbits. For example, if $k = 0$, then
\[b_M(k) = 2\]
corresponds to the orbits of $(0, 0, 0)$ and $(3, 3, 3)$. In order to study $b_M(k)$ both theoretically and numerically, they give an explicit reduction (descent) for the action of $\Gamma$ on $V_{k,M}(\mathbb{Z})$. For this purpose, it is convenient to remove an explicit set of special admissible $k$'s, namely those for which there is a point in $V_{k,M}(\mathbb{Z})$ with $|x_j| = 0$, 1 or 2. These $k$'s take the form (i) $k = u^2 + v^2$, (ii) $4(k - 1) = u^2 + 3v^2$, or (iii) $k = 4 + u^2$. The number of these special $k$'s (referred to as \textit{exceptional}) with $0 \leq k \leq K$ is asymptotic to $C' \frac{K}{\sqrt{\log K}}$. The remaining admissible $k$'s are called $\textit{generic}$ (all negative admissible $k$'s are generic). For them Ghosh and Sarnak give the following elegant reduced forms:

$\textbf{Proposition 3:}$

  •  Let $k \geq 5$ be generic and consider the compact set

    \[    \tilde{S}_k^+ = \{u \in \mathbb{R}^3 : 3 \leq u_1 \leq u_2 \leq u_3,  u_1^2 + u_2^2 + u_3^2 + u_1 u_2 u_3 = k\}.\]
    The points in $\tilde{S}_k^+(\mathbb{Z}) = \tilde{S}_k^+ \cap \mathbb{Z}^3$ are $\Gamma$-inequivalent, and any $x \in V_{k,M}(\mathbb{Z})$ is $\Gamma$-equivalent to a unique point $u' = (-u_1, u_2, u_3)$ with $u = (u_1, u_2, u_3) \in \tilde{S}_k^+(\mathbb{Z})$.

 Let $k < 0$ be admissible and consider the compact set
    \[\tilde{S}_k^- = \{u \in \mathbb{R}^3 : 3 \leq u_1 \leq u_2 \leq u_3 \leq \frac{1}{2} u_1 u_2,  u_1^2 + u_2^2 + u_3^2 - u_1 u_2 u_3 = k\}.\]
    The points in $\tilde{S}_k^-(\mathbb{Z}) = \tilde{S}_k^- \cap \mathbb{Z}^3$ are $\Gamma$-inequivalent, and any $x \in V_{k,M}(\mathbb{Z})$ is $\Gamma$-equivalent to a unique point $u = (u_1, u_2, u_3) \in \tilde{S}_k^-(\mathbb{Z})$.
\end{enumerate}

Some consequences of this are as follows: As $k \to \pm \infty$, we have \[b_M(k) \ll_{\epsilon} |k|^{\frac{1}{3}+\epsilon}.\]

Let $b_M^\pm(k) = |\delta_k^\pm(\mathbb{Z})|$ where $\pm = \operatorname{sgn}(k)$, this being defined for any $k$. Then for generic $k$, $b_M^\pm(k) = b_M(k)$ while otherwise $b_M(k) \leq b_M^\pm(k)$. We have
\[\sum_{\substack{k \neq 4 \\ |k| \leq K}} b_M^\pm(k) \sim C^\pm K (\log K)^2\]
where $C^\pm > 0$ and $K \to \infty$. So, on average, the numbers $b_M(k)$ are small. 

  1. $\textbf{Theorem 4:}$There are infinitely many Hasse failures. More precisely, the number of $0 < k \leq K$ and $-K \leq k < 0$ for which the Hasse principle fails is at least $\sqrt{K} (\log K)^{-\frac{1}{4}}$ for $K$ large.
  2.  Fix $t \geq 0$. Then as $K \to \infty$,

    \[    \#\{|k| \leq K : k \text{ admissible},  \mathfrak{h}_M(k) = 0\} = o(K).\]

Hasse failures are produced by an obstruction via quadratic reciprocity, much like the breakthrough showing that the local-global conjecture for Apollonian circle packings is false. 

Local to global lifting 

 
We now get to my favourite result out of all the gems here: a solution over $\mathbb{F}_p$ to the original Markov equation lifts to an integral solution. Using $\mathcal{G}_p$ to denote the Markov graph over $\mathbb{F}_p$, the statement that all solutions lift can be written as '$\mathcal{G}_p$ is connected for any prime $p$'. It was first shown for sufficiently large primes $p$ by William Chen using formidable tools of algebraic geometry. He deduces that the number of vertices in a connected component of $\mathcal{G}_p$ is divisible by $p$ as a corollary of the much more powerful machinery he builds. Combined with

$\textbf{Theorem 5:}$[Bourgain--Gamburd--Sarnak] Fix $\varepsilon>0$. For sufficiently large $p$, there is a connected component $\mathcal{C}_p$ of $\mathcal{G}_p$ for which $|\mathcal{G}_p \backslash \mathcal{C}_p| < p^\varepsilon$. 

This shows lifting for all but finitely many primes: suppose $\mathcal{G}_p$ has a connected component disjoint from $\mathcal{C}_p$. Denoting this by $\mathcal{C}'_p$, we have $|\mathcal{C}'_p|<p^\varepsilon$ for some $\varepsilon$ provided $p$ large enough (depending on $\varepsilon$). On the other hand, Chen's theorem says $|\mathcal{C}_p'|\geq p$. So setting $\varepsilon = 1$ provides a contradiction for large $p$.

However, after that an ingenious elementary proof of that the connected components have size divisible by $p$ was given by Daniel Martin, and it is this proof that I record here. It is remarkable in how simple and elementary it is.

$\textbf{Proof of divisibility:}$ It is easy to check the cases $p=2, 3$. Fix a prime $p>3$. To each nonzero $\mathbf{x} = (x_1,x_2,x_3)\in\mathbb{F}_p^3$ that solves the Markov equation, we associate a new triple $\mathbf{y} = (y_1,y_2,y_3)$ defined by 
\[\mathbf{y} = \begin{cases}\left(\tfrac{x_1}{3x_2x_3},\tfrac{x_2}{3x_1x_3},\tfrac{x_3}{3x_1x_2}\right) & x_1x_2x_3\neq 0\\ \left(0,\tfrac{1}{2},\tfrac{1}{2}\right) & x_1 = 0\\ \left(\tfrac{1}{2},0,\tfrac{1}{2}\right) & x_2 = 0\\ \left(\tfrac{1}{2},\tfrac{1}{2},0\right) & x_3 = 0.\end{cases} \tag{1}\]. 
By scaling either side of the Markov equation by $\tfrac{1}{3x_1x_2x_3}$ when $x_1x_2x_3\neq 0$, we see that \[y_1+y_2+y_3=1\tag{2}\] for any $\mathbf{y}$ associated to a nonzero Markov triple. 

Now, consider two triples $\mathbf{x} = (x_1,x_2,x_3)$ and $\mathbf{x}' = (x_1',x_2',x_3')$ that are connected by an edge in $\mathbf{G}_p$ from, say, the first-coordinate Vieta involution. In particular, $x_2'=x_2$ and $x_3'=x_3$. If $x_2x_3\neq 0$, we see from the first two cases of (1) that $y_1=\tfrac{x_1}{3x_2x_3}$ regardless of whether $x_1=0$. Also $y_1'=\tfrac{x_1'}{x_2x_3}$, so the Vieta equation gives 
\[y_1+y_1' = \frac{x_1}{3x_2x_3} + \frac{x_1'}{3x_2x_3} = \frac{x_1}{3x_2x_3} + \frac{3x_2x_3 - x_1}{3x_2x_3} = 1\tag{3}\].

Similarly (and still assuming that $\mathbf{x}$ and $\mathbf{x}'$ are connected by the first-coordinate Vieta involution) if $x_2 = x_2' = 0$ we see from (1) that 
\[y_1+y_1' = \tfrac{1}{2}+\tfrac{1}{2} = 1. \tag{4}\]
Of course, the same holds if $x_3 = 0$. In any case, $y_1+y_1' = 1$. Thus, if $\mathbf{x}\neq \mathbf{x}'$, then $y_1+y_1'$ counts half the number of vertices connected by this particular edge. Similarly, if $\mathbf{x}=\mathbf{x}'$ then $y_1=y_1'$, so (3) and (4) show that $y_1=\tfrac{1}{2}$; again half the number of vertices connected by the edge. We conclude that if $\mathbf{C}$ is a subgraph of $\mathbf{G}_p$ that contains either both or neither of any two (not necessarily distinct) vertices connected by a first-coordinate Vieta involution, we have \[\sum_{\mathbf{x}\in\mathbf{C}}y_1 = \tfrac{1}{2}|\mathbf{C}|, \tag{5}\]
where $|\mathbf{C}|$ is reduced $\text{mod}\,p$.

Now suppose $\mathbf{C}\subset\mathbf{G}_p$ is a connected component. Then $\mathbf{C}$ contains all three neighbouring vertices for any $\mathbf{x}\in\mathbf{C}$, so (5) holds, and it holds just as well for the second or third coordinate. Thus, \[|\mathbf{C}| = \sum_{\mathbf{x}\in\mathbf{C}}1 = \sum_{\mathbf{x}\in\mathbf{C}}\sum_{i=1}^3y_i\text{by (2)}\\ 
=\sum_{i=1}^3\sum_{\mathbf{x}\in\mathbf{C}}y_i = \sum_{i=1}^3\tfrac{1}{2}|\mathbf{C}|\text{by (5)}\\ 
=\tfrac{3}{2}|\mathbf{C}|.\]
Comparing either end, we see that $\tfrac{1}{2}|\mathbf{C}| \equiv 0\,\text{mod}\,p$, which completes the proof. $\blacksquare$

Comments

Popular Posts