Newton polygons
This post mostly reproduces the answer here. For background on local fields, see e.g. J.S. Milne's notes. We discuss the Newton polygon, which looks a lot like the Newton polytope previously mentioned here. We will need to write down a lot of definitions.
We say a set $S \subseteq \mathbb{R}^2$ is lower convex if it is convex and whenever $(x, y) \in S$, then $(x, z) \in S$ for all $z \geq y$. Given any set of points $T \subseteq \mathbb{R}^2$, there is a minimal lower convex set $S \supseteq T$, namely the intersection of all lower convex sets containing $T$; this intersection is over a non-empty family, since $\mathbb{R}^2$ itself is lower convex. This is known as the lower convex hull of the points.
Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in K[x]$, where $(K, v)$ is a valued field. Then the Newton polygon of $f$ is the lower convex hull of $\{(i, v(a_i)): i = 0, \cdots, n, a_i \not= 0\}$. Given a polynomial, the points $(i, v(a_i))$ lying on the boundary of the Newton polygon are known as the break points. The length or multiplicity of a line segment is the horizontal length.
An informal description can be given as follows: consider a rubber band surrounding all the points $P_i = (i, v(a_i))$. Stretch the band upwards, so that its lower side gets caught on some of the points (the points act like nails, partially hammered into the $xy$-plane). The vertices of the Newton polygon are exactly those points.
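To make this concrete, here is a small Python sketch (illustrative only; the helper names vp, newton_polygon and slopes are my own, and we assume integer coefficients with the $p$-adic valuation) that computes the Newton polygon by taking the lower convex hull of the points $(i, v_p(a_i))$:

```python
from fractions import Fraction

def vp(a, p):
    """p-adic valuation of a nonzero integer a."""
    v = 0
    while a % p == 0:
        a //= p
        v += 1
    return v

def newton_polygon(coeffs, p):
    """Vertices of the Newton polygon of sum(coeffs[i] * x^i) over Q_p.

    coeffs[i] is the integer coefficient a_i; zero coefficients are skipped.
    Returns the vertices of the lower convex hull of the points (i, v_p(a_i)).
    """
    pts = [(i, vp(a, p)) for i, a in enumerate(coeffs) if a != 0]
    hull = []
    for q in pts:  # points come already sorted by x-coordinate
        # drop the last vertex while it lies on or above the segment hull[-2] -> q
        while len(hull) >= 2:
            (x1, y1), (x2, y2), (x3, y3) = hull[-2], hull[-1], q
            if (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(q)
    return hull

def slopes(coeffs, p):
    """Slope sequence: one slope per unit of horizontal length, in increasing order."""
    verts = newton_polygon(coeffs, p)
    out = []
    for (x1, y1), (x2, y2) in zip(verts, verts[1:]):
        out += [Fraction(y2 - y1, x2 - x1)] * (x2 - x1)
    return out

# f(x) = x^3 + x + 5 over Q_5: points (0,1), (1,0), (3,0)
print(newton_polygon([5, 1, 0, 1], 5))            # [(0, 1), (1, 0), (3, 0)]
print([str(s) for s in slopes([5, 1, 0, 1], 5)])  # ['-1', '0', '0']
```

For $f(x) = x^3 + x + 5$ over $\mathbb{Q}_5$ this reports vertices $(0,1)$, $(1,0)$, $(3,0)$ and slope sequence $(-1, 0, 0)$, matching what one would draw by hand.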
It turns out the Newton polygon tells us something about the roots of the polynomial.
Theorem 1 Let $K$ be a complete valued field, and $v$ the valuation on $K$. We let
\[ f(x) = a_0 + a_1 x + \cdots + a_n x^n \in K[x]. \]
Let $L$ be the splitting field of $f$ over $K$, equipped with the unique extension $w$ of $v$. If $(r, v(a_r)) \to (s, v(a_s))$ is a line segment of the Newton polygon of $f$ with slope $-m \in \mathbb{R}$, then $f$ has precisely $s - r$ roots of valuation $m$, counted with multiplicity.
Note that by lower convexity, there can be at most one line segment for each slope. So this theorem makes sense.
Proof Dividing by $a_n$ only shifts the polygon vertically, so we may wlog assume $a_n = 1$. We number the roots of $f$ (in $L$, with multiplicity) such that
\[ w(\alpha_1) = \cdots = w(\alpha_{s_1}) = m_1\]
\[ w(\alpha_{s_1 + 1}) = \cdots = w(\alpha_{s_2}) = m_2 \cdots\]
\[ w(\alpha_{s_t + 1}) = \cdots = w(\alpha_n) = m_{t + 1} \]
where we have
\[ m_1 < m_2 < \cdots < m_{t + 1} \]
Then we know
\[ v(a_n) = v(1) = 0\]
\[ v(a_{n - 1}) = w\left(\sum \alpha_i\right) \geq \min_i w(\alpha_i) = m_1\]
\[ v(a_{n - 2}) = w\left(\sum \alpha_i\alpha_j\right) \geq \min_{i \not= j} w(\alpha_i \alpha_j) = 2 m_1\]
\[ v(a_{n - s_1}) = w\left(\sum_{i_1 < \cdots < i_{s_1}} \alpha_{i_1} \cdots \alpha_{i_{s_1}}\right) = \min w(\alpha_{i_1}\cdots\alpha_{i_{s_1}}) = s_1 m_1.\]
It is important that in the last one we have equality, not an inequality, because there is exactly one term in the sum, namely $\alpha_1 \cdots \alpha_{s_1}$, whose valuation is strictly less than that of all the others.
We can then continue to get
\[ v(a_{n - s_1 - 1}) \geq \min w(\alpha_{i_1} \cdots \alpha_{i_{s_1 + 1}}) = s_1 m_1 + m_2, \]
until we reach
\[ v(a_{n - s_2}) = s_1 m_1 + (s_2 - s_1) m_2. \]
We keep going. We don't know exactly where the other points are, but the equalities pin down the points $(n - s_k, v(a_{n - s_k}))$, and the inequalities imply that all the $(i, v(a_i))$ lie on or above the segments joining them. So this is the Newton polygon.
Counting from the right, the first line segment has length $n - (n - s_1) = s_1$ and slope
\[ \frac{0 - s_1 m_1}{n - (n - s_1)} = -m_1.\]
In general, the $k$th segment has length $(n - s_{k - 1}) - (n - s_k) = s_k - s_{k - 1}$, and slope
\[\frac{\left(s_1 m_1 + \sum_{i = 1}^{k - 2} (s_{i + 1} - s_i) m_{i + 1}\right) - \left(s_1 m_1 + \sum_{i = 1}^{k - 1} (s_{i + 1} - s_i) m_{i + 1}\right)}{s_k - s_{k - 1}} \]
\[ = \frac{-(s_k - s_{k - 1})m_k}{s_k - s_{k - 1}} = - m_k.\]
So the slopes and lengths are exactly as claimed. $\blacksquare$
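For instance, take $f(x) = x^3 + x + 5 \in \mathbb{Q}_5[x]$ (a small example of my own, for illustration). The relevant points are $(0, 1)$, $(1, 0)$ and $(3, 0)$, so the Newton polygon consists of a segment of slope $-1$ and length $1$ followed by a segment of slope $0$ and length $2$. Theorem 1 then says that, in a splitting field over $\mathbb{Q}_5$, $f$ has exactly one root of valuation $1$ and two roots of valuation $0$; one can check this directly by reducing mod $5$ and applying Hensel's lemma.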
Corollary 2 If $f$ is irreducible, then the Newton polygon has a single line segment.
Proof We need to show that all roots have the same valuation. Let $\alpha, \beta$ be roots of $f$ in the splitting field $L$. Then there is some $\sigma \in Aut(L/K)$ such that $\sigma(\alpha) = \beta$. Since $w \circ \sigma$ is also an extension of $v$ to $L$, and the extension is unique, we get $w(\beta) = w(\sigma(\alpha)) = w(\alpha)$. $\blacksquare$
It is an exercise to deduce Eisenstein's criterion from this.
Corollary 3 Assume that $(K, v_K)$ is Henselian. Write the Newton polygon of $f$ as a union of $r$ segments, the $i$th having slope $\mu_i$ and horizontal length $\lambda_i$ (so $\mu_1 < \mu_2 < \cdots < \mu_r$). Then
\[f = A f_1 f_2 \cdots f_r,\]
where $A \in K$ is the leading coefficient of $f$, $f_i \in K[X]$ is monic for every $i$, the roots of $f_i$ are of valuation $-\mu_i$, and
\[\deg(f_i) = \lambda_i.\] Moreover,
\[\mu_i = -\frac{v_K(f_i(0))}{\lambda_i},\]
and if $v_K(f_i(0))$ is coprime to $\lambda_i$, then $f_i$ is irreducible over $K$.
Proof For every $i$, denote by $f_i$ the product of the linear factors $(X - \alpha)$, with multiplicity, such that $\alpha$ is a root of $f$ and $v_L(\alpha) = -\mu_i$. We also write
\[
f = A P_1^{k_1} P_2^{k_2} \cdots P_s^{k_s}
\]
for the factorization of $f$ in $K[X]$ into prime monic factors ($A \in K$). Let $\alpha$ be a root of $f_i$. After renumbering, we may assume that $P_1$ is the minimal polynomial of $\alpha$ over $K$. If $\alpha'$ is a root of $P_1$, there exists a $K$-automorphism $\sigma$ of $L$ that sends $\alpha$ to $\alpha'$, and we have
\[
v_L(\sigma \alpha) = v_L(\alpha),
\]
since $K$ is Henselian. Therefore $\alpha'$ is also a root of $f_i$. Moreover, every root of $P_1$ of multiplicity $\nu$ is clearly a root of $f_i$ of multiplicity $k_1 \nu$, since repeated roots obviously share the same valuation. This shows that $P_1^{k_1}$ divides $f_i$.
Let $g_i = \frac{f_i}{P_1^{k_1}}$. Choose a root $\beta$ of $g_i$. Notice that the roots of $g_i$ are distinct from the roots of $P_1$. Repeat the previous argument with the minimal polynomial of $\beta$ over $K$, assumed w.l.o.g. to be $P_2$, to show that $P_2^{k_2}$ divides $g_i$. Continuing this process until all the roots of $f_i$ are exhausted, one eventually arrives at
\[
f_i = P_1^{k_1} \cdots P_m^{k_m}, \quad \text{with } m \leq s.
\]
This shows that $f_i \in K[X]$, and $f_i$ is monic. The $f_i$ are pairwise coprime, since their roots have distinct valuations, and together they account for all the roots of $f$ with multiplicity. Hence clearly
\[
f = A f_1 f_2 \cdots f_r,
\]
showing the main claim. The fact that $\lambda_i = \deg(f_i)$ follows from Theorem 1, and so does the fact that
\[
\mu_i = -\frac{v_K(f_i(0))}{\lambda_i},
\]
by noting that the Newton polygon of $f_i$ consists of a single segment joining $(0, v_K(f_i(0)))$ to $(\lambda_i, v_K(1)) = (\lambda_i, 0)$. The irreducibility criterion follows by a similar computation: if $g$ is a monic irreducible factor of $f_i$ over $K$, then all of its roots have valuation $-\mu_i$, so $v_K(g(0)) = -\deg(g)\,\mu_i = \deg(g)\, v_K(f_i(0))/\lambda_i$ is an integer; by coprimality $\lambda_i$ divides $\deg(g)$, hence $\deg(g) = \lambda_i$ and $f_i = g$ is irreducible. $\blacksquare$
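For example (a standard example, not from the original answer): for $f(X) = X^n - p$ over $K = \mathbb{Q}_p$, the Newton polygon is a single segment from $(0, 1)$ to $(n, 0)$, so $\lambda_1 = n$ and $v_K(f(0)) = 1$ are coprime, and Corollary 3 shows that $X^n - p$ is irreducible over $\mathbb{Q}_p$. This is of course also an instance of the Eisenstein exercise above.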
The following is an immediate corollary of the factorization above and constitutes a test for the reducibility of polynomials over Henselian fields:
Corollary 4 Assume that $(K, v_K)$ is Henselian. If the Newton polygon of $f$ does not reduce to a single segment, then $f$ is reducible over $K$.
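For instance (again an example of my own): $f(X) = X^2 + pX + p^3$ over $\mathbb{Q}_p$ gives the points $(0, 3)$, $(1, 1)$, $(2, 0)$, and the Newton polygon has two segments, of slopes $-2$ and $-1$. By Corollary 4, $f$ is reducible over $\mathbb{Q}_p$; indeed, by Corollary 3 the two roots lie in $\mathbb{Q}_p$ and have valuations $2$ and $1$.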
All of this is already quite powerful, but what is even more remarkable about the Newton polygon is that you can figure out the Newton polygon of a product of polynomials from the Newton polygons of the individual polynomials. The reason this is true is that the coefficients of the product are easily described in terms of the coefficients of the individual polynomials, and then one just needs to do a little bit of hard work to figure out the details.
Ostrowski's Lemma: Let $f, f'$ be $p$-adic polynomials with slope sequences $s_1, \ldots, s_n$ and $t_1, \ldots, t_m$ respectively, where the slope sequence of a polynomial lists the slopes of its Newton polygon in increasing order, each repeated according to the horizontal length of its segment. Then the slope sequence of $ff'$ is the sequence $s_1, \ldots, s_n, t_1, \ldots, t_m$ sorted in ascending order.
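As a quick check (an example of my own): over $\mathbb{Q}_5$, the polynomial $f = x + 5$ has slope sequence $(-1)$ and $f' = x^2 + x + 5$ has slope sequence $(-1, 0)$. Their product is
\[
f f' = x^3 + 6x^2 + 10x + 25,
\]
with points $(0, 2)$, $(1, 1)$, $(2, 0)$, $(3, 0)$; its Newton polygon has a segment of slope $-1$ and length $2$ followed by a segment of slope $0$ and length $1$, i.e. slope sequence $(-1, -1, 0)$, exactly the sorted merge predicted by Ostrowski's Lemma (the slopes helper in the Python sketch above reports the same).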
We now use this to do an Olympiad problem, because the intended solution was clearly Newton polygons. Suppose that $f\in \mathbb{Q}[X]$ is such that $f\circ f$ and $f\circ f \circ f$ are both integer polynomials. Then $f$ must be an integer polynomial too.
Lemma 5: Let $f \in \mathbb{Q}_p[x]$ have constant term $f_0 \in \mathbb{Z}_p$, but not all of its coefficients in $\mathbb{Z}_p$. Then $f \circ f$ cannot have all its coefficients in $\mathbb{Z}_p$.
Proof: Note that when we write \( f = f_0 + \dots + f_d x^d \), we get
\[
f \circ f = f(f_0) + \dots + f_d^{d+1} x^{d^2}.
\]
So the leading coefficient of \( f \circ f \) is \( f_d^{d+1} \). Clearly, if \( f_d \notin \mathbb{Z}_p \), then \( f_d^{d+1} \notin \mathbb{Z}_p \), and we are done. Therefore we assume that \( f_d \in \mathbb{Z}_p \).
What can we infer about the Newton polygon of \( f \)? Well, \( \nu(f_0), \nu(f_d) \geq 0 \), but by assumption one of \( f \)'s coefficients is not a \( p \)-adic integer, so \( \nu(f_j) < 0 \) for some \( j \). Thus, the Newton polygon of \( f \) strictly dips below zero to accommodate \( (j, \nu(f_j)) \), and then recovers. This means that the polygon's slopes are first negative and then positive. Writing \( N_i \) for the height of the polygon above \( x = i \), and letting \( (k, N_k) \) be the rightmost lowest point of the polygon, we can order the slope sequence as
\[
s_1 \leq s_2 \leq \dots \leq s_k \leq 0 < s_{k+1} \leq \dots \leq s_d.
\]
Note that \( N_k < 0 \), since \( (k, N_k) \) is the lowest point and we know that the polygon goes below zero. We keep track of this fact.
By Theorem 1, we may label the roots \( \theta_1, \dots, \theta_d \) of \( f \) (in an algebraic closure of \( \mathbb{Q}_p \), with \( \nu \) extended) so that \( \nu(\theta_j) = -s_j \). Observe that
\[
f(x) = f_d \prod_j (x - \theta_j) \Rightarrow f \circ f(x) = f_d \prod_j (f(x) - \theta_j),
\]
and therefore \( f \circ f(x) \) has been expressed as a product of polynomials whose Newton polygons can be ascertained. We are firmly in Ostrowski's Lemma (OL) territory. However, we must break into two cases, according to whether \( j \) lies to the left or to the right of \( k \) on the polygon.
The strategy is, essentially, this: to show \( f \circ f \notin \mathbb{Z}_p[x] \), it suffices to show that the Newton polygon of \( f \circ f \) dips below the \( x \)-axis somewhere, since wherever it dips, the corresponding coefficient can't be a \( p \)-adic integer. So we need to study the height changes of the polygons of the \( f(x) - \theta_j \) and transfer them to that of \( f \circ f \) using OL.
- If \( 1 \leq j \leq k \), then \( \nu(\theta_j) \geq 0 \). The constant term of \( f - \theta_j \) is \( f_0 - \theta_j \), which by the ultrametric nature of the valuation has non-negative valuation. The other coefficients don't change, so the part of the polygon to the right of the lowest point \( (k, N_k) \) is unchanged and the positive slopes remain \( (s_{k+1}, \dots, s_d) \). Thus, the height change of the Newton polygon of \( f - \theta_j \) between \( (k, N_k) \) and \( (d, N_d) \) is \[ s_{k+1} + \dots + s_d = N_d - N_k.\]
- If \( k+1 \leq j \leq d \), then \( \nu(\theta_j) < 0 \). The constant term of \( f - \theta_j \) is \( f_0 - \theta_j \), of valuation \[ \nu(f_0 - \theta_j) = \nu(\theta_j) = -s_j < 0 \]
by the ultrametric property of the valuation. The right-hand endpoint is still \( (d, N_d) \). Thus, the rise of the Newton polygon of \( f - \theta_j \) from its lowest point to \( (d, N_d) \) is at least \( N_d + s_j \), since the polygon passes through or below \( (0, -s_j) \) and then climbs back up. It turns out this estimate is sufficient.
Now, we are trying to bound the drop of the Newton polygon of \( f \circ f \) from its right endpoint down to its lowest point: if this drop exceeds the right endpoint's \( y \)-coordinate, the polygon dips below the \( x \)-axis and we are done. By OL, the slope sequence of \( f \circ f \) is obtained by sorting together all the slopes of the factors \( f(x) - \theta_j \), so to find the lowest point of the Newton polygon of \( f \circ f \) we add up the positive slopes to get the total drop.
What are those positive slopes? Each factor with \( 1 \leq j \leq k \) contributes a drop of \( N_d - N_k \), and there are \( k \) such factors, giving a combined drop of
\[
k(N_d - N_k)
\]
from these cases. For \( k+1 \leq j \leq d \), we have a drop of at least \( N_d + s_j \) in each case. Finally, the total drop is at least
\[
k(N_d - N_k) + \sum_{j = k+1}^{d} (N_d + s_j).
\]
Recall that the leading term of \( f \circ f \) is \( f_d^{d+1} x^{d^2} \). Using the multiplicative property of the valuation, we get that the right endpoint of the Newton polygon of \( f \circ f \) is
\[
(d^2, \nu(f_d^{d+1})) = (d^2, (d+1) N_d).
\]
Exercise: Show that
\[
k(N_d - N_k) + \sum_{j = k+1}^d (N_d + s_j) - (d+1) N_d = -(k+1)N_k,
\]
which is positive since $N_k < 0$. So the total drop exceeds the height of the right endpoint, and therefore the Newton polygon of $f \circ f$ dips below the $x$-axis somewhere; the corresponding coefficient of $f \circ f$ is not in $\mathbb{Z}_p$. $\blacksquare$
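To see this machinery on a concrete example (my own, purely illustrative): take $f(x) = x^2 + \tfrac{1}{5}x + 1 \in \mathbb{Q}_5[x]$, so $f_0, f_2 \in \mathbb{Z}_5$ but $f_1 \notin \mathbb{Z}_5$. The Newton polygon of $f$ has vertices $(0,0)$, $(1,-1)$, $(2,0)$, so $k = 1$ and $N_k = -1$, and the computation above says that the polygon of $f \circ f$ should dip to height at most $(k+1)N_k = -2$. Indeed,
\[
f \circ f = x^4 + \tfrac{2}{5}x^3 + \tfrac{56}{25}x^2 + \tfrac{11}{25}x + \tfrac{11}{5},
\]
whose Newton polygon bottoms out at height exactly $-2$.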
Lemma 6 Let $g$ have coefficients in $\mathbb{Q}_p$ and suppose $g \circ g, g \circ g \circ g$ have coefficients in $\mathbb{Z}_p$. Then $g$ has coefficients in $\mathbb{Z}_p$.
Proof: Note that \( g(g(0)), g(g(g(0))) \in \mathbb{Z}_p \), since these are the constant terms of \( g \circ g \) and \( g \circ g \circ g \) respectively. Let
\[
f(x) = g(x + g(g(0))) - g(g(0)).
\]
Now, note that
\[
f \circ f(x) = g(g(x + g(g(0)))) - g(g(0)) = g \circ g(x + g \circ g(0)) - g \circ g(0),
\]
which is a difference of polynomials in \( \mathbb{Z}_p[x] \) (substituting \( x + c \) with \( c \in \mathbb{Z}_p \) into a polynomial in \( \mathbb{Z}_p[x] \) keeps the coefficients in \( \mathbb{Z}_p \)), hence belongs to \( \mathbb{Z}_p[x] \). Finally,
\[
f(0) = g \circ g \circ g(0) - g \circ g(0),
\]
hence \( f(0) \in \mathbb{Z}_p \). By the contrapositive of Lemma 5, we get that \( f \) has coefficients in \( \mathbb{Z}_p \).
But then \( g \) has coefficients in \( \mathbb{Z}_p \) too, from the following logic: if \( f \) has coefficients in \( \mathbb{Z}_p \), then so does
\[
f(x) + g(g(0)) = g(x + g(g(0))),
\]
since \( g(g(0)) \in \mathbb{Z}_p \). But this is just a \( g(g(0)) \)-shift of the polynomial \( g(x) \), and shifting by a \( p \)-adic integer (in either direction) retains coefficient membership in \( \mathbb{Z}_p \), by a simple binomial theorem type argument; shifting back by \( -g(g(0)) \) recovers \( g \). $\blacksquare$
It is now easy to conclude: a rational number lying in $\mathbb{Z}_p$ for all primes $p$ must lie in $\mathbb{Z}$, and this applies to every coefficient of $f$, so we are done.