### Solving a quartic equation

###### 2013-12-15

This is a method of solving the general quartic equation that my father showed me. The result is a very nice, compact and symmetric expression for the roots.

## Problem

#### Solve the quartic equation

$a{X}^{4}+b{X}^{3}+c{X}^{2}+dX+e=0$

## Solution

First divide through by $a$ and then substitute $X=x-\frac{b}{4a}$ in order to eliminate the cubic term. This leaves us with

$x^4+px^2+qx+r=0.\tag{depressed-quartic}$
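The coefficients $p$, $q$ and $r$ are not written out here, so the following is a minimal sketch of the substitution using the standard formulas for the shift $X=x-\frac{b}{4a}$. The function name `depress_quartic` and the example quartic are assumptions made for illustration only.

```python
def depress_quartic(a, b, c, d, e):
    """Return (p, q, r) of x^4 + p x^2 + q x + r after X = x - b/(4a)."""
    # Normalise to a monic quartic X^4 + B X^3 + C X^2 + D X + E.
    B, C, D, E = b / a, c / a, d / a, e / a
    p = C - 3 * B**2 / 8
    q = D - B * C / 2 + B**3 / 8
    r = E - B * D / 4 + B**2 * C / 16 - 3 * B**4 / 256
    return p, q, r


# Example: (X - 2)(X - 3)(X - 4)(X + 5) = X^4 - 4X^3 - 19X^2 + 106X - 120.
# Shifting by b/(4a) = -1 moves the roots to 1, 2, 3, -6.
p, q, r = depress_quartic(1, -4, -19, 106, -120)
print(p, q, r)
```

For this example the depressed quartic should come out as $x^4-25x^2+60x-36$, whose roots $1, 2, 3, -6$ sum to zero as expected once the cubic term is gone.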

Now, introduce a new variable $y={x}^{2}+p/2$. Then our quartic is equivalent to the pair of simultaneous equations:

$\begin{aligned} y^2+qx+\left(r-\frac{p^2}{4}\right)&=0,\tag{parabolas} \\ x^2-y+\frac{p}{2}&=0. \end{aligned}$

Each of these equations describes a parabola in the $xy$-plane, one opening horizontally and one opening vertically. The $x$-coordinates of their intersections (there can be up to four) are the roots of our quartic. Finding those intersections directly is still just as hard as the original problem.

There is a great trick to move forward: we consider the problem in a more general light. Define ${f}_{m}\left(x,y\right)=\left({y}^{2}+qx+\left(r-\frac{{p}^{2}}{4}\right)\right)+m\left({x}^{2}-y+\frac{p}{2}\right)\mathrm{.}$

If we consider a solution $\left(x,y\right)$ of our simultaneous equations, then ${f}_{m}\left(x,y\right)=0$ for every $m$ (of course, ${f}_{m}$ will generally have other zeroes in addition to those). In other words, every curve in the family $\mathcal{F}={\left\{\left(x,y\right)\in {\mathbb{R}}^{2}\mid {f}_{m}\left(x,y\right)=0\right\}}_{m\in \mathbb{R}}$ will contain our points of interest.
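As a quick numerical sanity check of this claim, here is a small sketch. The example quartic $x^4-25x^2+60x-36$ (roots $1, 2, 3, -6$) and the sampled values of $m$ are assumptions chosen for illustration, not part of the derivation.

```python
def f(m, x, y, p, q, r):
    """The family member f_m(x, y) built from the two parabolas."""
    return (y**2 + q * x + (r - p**2 / 4)) + m * (x**2 - y + p / 2)


# Depressed quartic x^4 - 25x^2 + 60x - 36 with roots 1, 2, 3, -6.
p, q, r = -25.0, 60.0, -36.0
roots = [1.0, 2.0, 3.0, -6.0]

# Each root x gives an intersection point (x, x^2 + p/2) of the parabolas.
points = [(x, x**2 + p / 2) for x in roots]

# f_m should vanish at those points whatever value m takes.
residuals = [f(m, x, y, p, q, r) for m in (-3.0, 0.5, 7.0) for (x, y) in points]
print(max(abs(v) for v in residuals))
```

The printed maximum residual should be zero up to rounding, since both bracketed expressions vanish separately at each intersection point.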

Now, the equation ${f}_{m}\left(x,y\right)=0$ describes a conic section in the $xy$-plane. Completing the square, we find ${\left(y-\frac{m}{2}\right)}^{2}+m{\left(x+\frac{q}{2m}\right)}^{2}=\frac{{q}^{2}}{4m}+\frac{1}{4}{\left(p-m\right)}^{2}-r\mathrm{.}$
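To make the completing-the-square step explicit (a worked expansion using only the definition of ${f}_{m}$, and assuming $m\ne 0$), group the $y$ terms and the $x$ terms and complete each square:

$\begin{aligned} f_m(x,y)&=\left(y^2-my\right)+\left(mx^2+qx\right)+r-\frac{p^2}{4}+\frac{mp}{2} \\ &=\left(y-\frac{m}{2}\right)^2-\frac{m^2}{4}+m\left(x+\frac{q}{2m}\right)^2-\frac{q^2}{4m}+r-\frac{p^2}{4}+\frac{mp}{2}. \end{aligned}$

Setting this to zero and collecting the constants with $\frac{m^2}{4}-\frac{mp}{2}+\frac{p^2}{4}=\frac{1}{4}\left(p-m\right)^2$ gives the right-hand side quoted above.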

This conic changes its character as $m$ is varied, from ellipse to hyperbola, etc. However, every conic in this family $\mathcal{F}$ contains the points where the parabolas \eqref{parabolas} intersect.

To find these points common to the family of conics, we use another great trick. There are a handful of particularly simple conic sections that can be picked out of the family $\mathcal{F}$: when the right-hand side vanishes, the conic degenerates into a pair of lines! The condition for this is

$m^3-2pm^2+(p^2-4r)m+q^2=0,\tag{cubic}$
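A small sketch can confirm that the roots of this cubic are exactly the values of $m$ at which the right-hand side of the conic vanishes. The helper names `resolvent_cubic` and `conic_rhs`, and the example $x^4-25x^2+60x-36$, are assumptions introduced here for illustration.

```python
def resolvent_cubic(p, q, r):
    """Coefficients of m^3 - 2p m^2 + (p^2 - 4r) m + q^2, leading term first."""
    return (1.0, -2.0 * p, p**2 - 4.0 * r, q**2)


def conic_rhs(m, p, q, r):
    """Right-hand side of the completed-square form of f_m = 0 (needs m != 0)."""
    return q**2 / (4 * m) + (p - m) ** 2 / 4 - r


p, q, r = -25.0, 60.0, -36.0
coeffs = resolvent_cubic(p, q, r)
print(coeffs)  # (1.0, 50.0, 769.0, 3600.0)

# For this example the cubic factors as (m + 25)(m + 16)(m + 9).
for m in (-25.0, -16.0, -9.0):
    c3, c2, c1, c0 = coeffs
    print(m, c3 * m**3 + c2 * m**2 + c1 * m + c0, conic_rhs(m, p, q, r))
```

Both printed quantities should vanish for each of the three listed values of $m$, because the cubic is just $4m$ times the right-hand side.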

and the resulting conics are ${\left(y-\frac{m}{2}\right)}^{2}=-m{\left(x+\frac{q}{2m}\right)}^{2}$

or $y=\frac{m}{2}\pm \sqrt{-m}\left(x+\frac{q}{2m}\right)\mathrm{.}$

We will call the three roots of \eqref{cubic} ${m}_{1}$, ${m}_{2}$ and ${m}_{3}$. For each of those values of $m$, the conic is a pair of intersecting lines which, again, always contain the points we care about.

Out of those three special values of $m$, we will take ${m}_{1}$ and ${m}_{2}$ and consider the intersections of those conics. In a normal, non-degenerate case, we expect four intersections: line $1$ of the ${m}_{1}$ conic will hit each of the two lines of the ${m}_{2}$ conic; line $2$ of the ${m}_{1}$ conic will do the same. Of course, working with lines, it is easy to find the intersections.
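The four line intersections can also be found numerically. The sketch below assumes the example quartic $x^4-25x^2+60x-36$, for which $-25$ and $-16$ are two roots of the cubic; the helper names `line` and `intersect` are hypothetical.

```python
import cmath


def line(m, sign, q):
    """Slope and intercept of y = m/2 + sign*sqrt(-m)*(x + q/(2m))."""
    s = sign * cmath.sqrt(-m)
    return s, m / 2 + s * q / (2 * m)


def intersect(m1, sign1, m2, sign2, q):
    """x-coordinate where a line of the m1 conic meets a line of the m2 conic."""
    k1, c1 = line(m1, sign1, q)
    k2, c2 = line(m2, sign2, q)
    return (c2 - c1) / (k1 - k2)


p, q, r = -25.0, 60.0, -36.0   # x^4 - 25x^2 + 60x - 36, roots 1, 2, 3, -6
m1, m2 = -25.0, -16.0          # two roots of the cubic for this example

found = []
for s1 in (+1, -1):
    for s2 in (+1, -1):
        x = intersect(m1, s1, m2, s2, q)
        found.append(x)
        # The last column is the quartic evaluated at x; it should be ~0.
        print(s1, s2, x, x**4 + p * x**2 + q * x + r)
```

Each of the four sign combinations should land on a different root of the quartic, which is the geometric content of the argument: two pairs of lines meet in four points.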

For example, considering the intersection of the $+$ branch for ${m}_{1}$ with the $+$ branch of ${m}_{2}$, we write $\frac{{m}_{1}}{2}+\sqrt{-{m}_{1}}\left({x}_{++}+\frac{q}{2{m}_{1}}\right)=\frac{{m}_{2}}{2}+\sqrt{-{m}_{2}}\left({x}_{++}+\frac{q}{2{m}_{2}}\right)\mathrm{.}$

Rearranging, and using $\frac{\sqrt{-m}}{m}=-\frac{1}{\sqrt{-m}}$, ${x}_{++}\left(\sqrt{-{m}_{1}}-\sqrt{-{m}_{2}}\right)=-\frac{1}{2}\left({m}_{1}-{m}_{2}\right)-\frac{q}{2}\frac{\sqrt{-{m}_{1}}-\sqrt{-{m}_{2}}}{\sqrt{-{m}_{1}}\sqrt{-{m}_{2}}}\mathrm{.}$

This simplifies if we expand ${m}_{1}-{m}_{2}$ as a difference of squares, ${m}_{1}-{m}_{2}=-\left(\sqrt{-{m}_{1}}-\sqrt{-{m}_{2}}\right)\left(\sqrt{-{m}_{1}}+\sqrt{-{m}_{2}}\right)$. Then we get ${x}_{++}=\frac{1}{2}\left(\sqrt{-{m}_{1}}+\sqrt{-{m}_{2}}-\frac{q}{\sqrt{-{m}_{1}}\sqrt{-{m}_{2}}}\right)\mathrm{.}$

The other three solutions, ${x}_{+-}$, ${x}_{-+}$ and ${x}_{--}$, all look similar: each $-$ branch flips the sign of the corresponding radical.

Another simplification comes if we recall that $\left(m-{m}_{1}\right)\left(m-{m}_{2}\right)\left(m-{m}_{3}\right)={m}^{3}-2p{m}^{2}+\left({p}^{2}-4r\right)m+{q}^{2},$

so that ${q}^{2}=-{m}_{1}{m}_{2}{m}_{3}$. If we assume that $q>0$, then $q=\sqrt{-{m}_{1}{m}_{2}{m}_{3}}$. Taking this to equal $\sqrt{-{m}_{1}}\sqrt{-{m}_{2}}\sqrt{-{m}_{3}}$ (true, for instance, when all three ${m}_{i}$ are negative), we find a very nice form for the four solutions:

$\begin{gathered}
x_{++}=\frac{1}{2}\left(+\sqrt{-m_1}+\sqrt{-m_2}-\sqrt{-m_3}\right), \\
x_{+-}=\frac{1}{2}\left(+\sqrt{-m_1}-\sqrt{-m_2}+\sqrt{-m_3}\right), \\
x_{-+}=\frac{1}{2}\left(-\sqrt{-m_1}+\sqrt{-m_2}+\sqrt{-m_3}\right), \\
x_{--}=\frac{1}{2}\left(-\sqrt{-m_1}-\sqrt{-m_2}-\sqrt{-m_3}\right).
\end{gathered}$

Thus the roots of the depressed quartic \eqref{depressed-quartic} are expressed as symmetric combinations of the roots of the cubic \eqref{cubic}. They are symmetric in the sense that the arbitrary numbering “${m}_{1}$”, etc. is irrelevant: ${x}_{--}$ is independent of any numbering scheme, while the other three solutions each simply single out one of the three $m$ values to have a negative sign.

We can verify that these solutions are correct by checking that $\left(x-{x}_{++}\right)\left(x-{x}_{-+}\right)\left(x-{x}_{+-}\right)\left(x-{x}_{--}\right)={x}^{4}+p{x}^{2}+qx+r\mathrm{.}$

Expanding, and making reference to a similar expansion of \eqref{cubic}, we see that

$\begin{aligned}
\sum_i x_i&=0, \\
\sum_{i<j} x_i x_j=\frac{1}{2}(m_1+m_2+m_3)&=p, \\
\sum_{i<j<k} x_i x_j x_k=-\sqrt{-m_1 m_2 m_3}=-\sqrt{q^2}&=-q, \\
\prod_i x_i=\frac{1}{16}\left(\left(m_1+m_2+m_3\right)^2-4(m_1 m_2+m_1 m_3+m_2 m_3)\right)&=r.
\end{aligned}$
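These four identities are easy to check numerically. The sketch below assumes the example with cubic roots $-25, -16, -9$ (so $p=-25$, $q=60$, $r=-36$); the helper `elementary_symmetric` is a name introduced here.

```python
import cmath
from itertools import combinations


def elementary_symmetric(values, k):
    """Sum of the products of all k-element subsets of values."""
    total = 0
    for combo in combinations(values, k):
        term = 1
        for v in combo:
            term *= v
        total += term
    return total


m1, m2, m3 = -25.0, -16.0, -9.0
s1, s2, s3 = (cmath.sqrt(-m) for m in (m1, m2, m3))
xs = [
    (-s1 - s2 - s3) / 2,
    (-s1 + s2 + s3) / 2,
    (+s1 - s2 + s3) / 2,
    (+s1 + s2 - s3) / 2,
]

e1, e2, e3, e4 = (elementary_symmetric(xs, k) for k in (1, 2, 3, 4))
# For x^4 + p x^2 + q x + r we expect e1 = 0, e2 = p, e3 = -q, e4 = r.
print(e1, e2, -e3, e4)
```

With all three $m_i$ negative, as in this example, the printed values should reproduce $0$, $p$, $q$ and $r$.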

Therefore the roots of the quartic may be expressed as

$\begin{gathered}
X_1=-\frac{b}{4a}+\frac{1}{2}\left(-\sqrt{-m_1}-\sqrt{-m_2}-\sqrt{-m_3}\right), \\
X_2=-\frac{b}{4a}+\frac{1}{2}\left(-\sqrt{-m_1}+\sqrt{-m_2}+\sqrt{-m_3}\right), \\
X_3=-\frac{b}{4a}+\frac{1}{2}\left(+\sqrt{-m_1}-\sqrt{-m_2}+\sqrt{-m_3}\right), \\
X_4=-\frac{b}{4a}+\frac{1}{2}\left(+\sqrt{-m_1}+\sqrt{-m_2}-\sqrt{-m_3}\right),
\end{gathered}$

with ${m}_{1},{m}_{2},{m}_{3}$ the roots of the cubic \eqref{cubic}.
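Putting the pieces together, here is a hedged end-to-end sketch rather than a definitive implementation. Two things in it are assumptions not taken from the text: the cubic is solved with Cardano's formula in complex arithmetic (the post does not say how to solve it), and $\sqrt{-{m}_{3}}$ is replaced by $q/\left(\sqrt{-{m}_{1}}\sqrt{-{m}_{2}}\right)$, which squares to $-{m}_{3}$ because ${q}^{2}=-{m}_{1}{m}_{2}{m}_{3}$ and avoids having to decide the sign of the third radical by hand. The names `cubic_roots` and `solve_quartic` are hypothetical.

```python
import cmath


def cubic_roots(c2, c1, c0):
    """Roots of m^3 + c2 m^2 + c1 m + c0 via Cardano's formula."""
    P = c1 - c2**2 / 3
    Q = 2 * c2**3 / 27 - c2 * c1 / 3 + c0
    disc = cmath.sqrt(Q**2 / 4 + P**3 / 27)
    # Use whichever sign keeps the cube-root argument away from zero.
    u = -Q / 2 + disc
    if abs(u) < abs(-Q / 2 - disc):
        u = -Q / 2 - disc
    if u == 0:  # P = Q = 0, i.e. a triple root
        return [-c2 / 3] * 3
    C = u ** (1 / 3)
    omega = complex(-0.5, 3**0.5 / 2)
    return [C * omega**k - P / (3 * C * omega**k) - c2 / 3 for k in range(3)]


def solve_quartic(a, b, c, d, e):
    """Roots of a X^4 + b X^3 + c X^2 + d X + e = 0, assuming a != 0."""
    B, C, D, E = b / a, c / a, d / a, e / a
    p = C - 3 * B**2 / 8
    q = D - B * C / 2 + B**3 / 8
    r = E - B * D / 4 + B**2 * C / 16 - 3 * B**4 / 256
    # Roots of m^3 - 2p m^2 + (p^2 - 4r) m + q^2, largest magnitude first.
    m = sorted(cubic_roots(-2 * p, p**2 - 4 * r, q**2), key=abs, reverse=True)
    s1, s2 = cmath.sqrt(-m[0]), cmath.sqrt(-m[1])
    # Choose the third radical so that s1*s2*s3 = q, whatever the sign of q.
    s3 = q / (s1 * s2) if s1 * s2 != 0 else cmath.sqrt(-m[2])
    shift = -B / 4
    return [
        shift + (-s1 - s2 - s3) / 2,
        shift + (-s1 + s2 + s3) / 2,
        shift + (+s1 - s2 + s3) / 2,
        shift + (+s1 + s2 - s3) / 2,
    ]


roots = solve_quartic(1, -4, -19, 106, -120)   # (X - 2)(X - 3)(X - 4)(X + 5)
print(sorted(round(z.real, 9) for z in roots))
```

Fixing the product of the three radicals to equal $q$ means the same four sign patterns serve for either sign of $q$. Accuracy near repeated roots is not addressed by this sketch.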

## What if $q<0$?

If we back up a bit, we see that it was important, in the final expressions, that $q>0$. However, the actual expression for $q$, coming from \eqref{depressed-quartic} is $q=\frac{d}{a}+\frac{{b}^{3}}{8{a}^{3}}-\frac{bc}{2{a}^{2}}\mathrm{.}$

This could very easily be negative. Therefore we must consider that case separately.

We now have $q=-\sqrt{-{m}_{1}{m}_{2}{m}_{3}}$ and the solutions become

$\begin{gathered}
x'_{++}=\frac{1}{2}\left(+\sqrt{-m_1}+\sqrt{-m_2}+\sqrt{-m_3}\right), \\
x'_{+-}=\frac{1}{2}\left(+\sqrt{-m_1}-\sqrt{-m_2}-\sqrt{-m_3}\right), \\
x'_{-+}=\frac{1}{2}\left(-\sqrt{-m_1}+\sqrt{-m_2}-\sqrt{-m_3}\right), \\
x'_{--}=\frac{1}{2}\left(-\sqrt{-m_1}-\sqrt{-m_2}+\sqrt{-m_3}\right).
\end{gathered}$

Each of these is the negative of one of the roots from the case where $q>0$ (for example, ${x}_{++}^{\mathrm{\prime }}$ is the negative of the root in which all three radicals carry a minus sign). However, they are, indeed, different values from before.

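In verifying the expansion for \eqref{depressed-quartic}, only the coefficient of $x$ changes: negating all four roots flips the sign of the sum of triple products, which becomes $+\sqrt{-{m}_{1}{m}_{2}{m}_{3}}=\sqrt{{q}^{2}}=-q$ when $q<0$, so the $q<0$ solutions check out.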

TODO: Numerically evaluating these formulas in Mathematica indicates there are issues. For some parameters the solution is right, but not as often as I’d hope. The issues are probably due to being too loose with radicals of negative quantities: it’s standard for some of the $m$’s to be positive, and $\sqrt{-{m}_{1}}\sqrt{-{m}_{2}}$ is not generally $\sqrt{{m}_{1}{m}_{2}}$.
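The radical issue is easy to reproduce with principal square roots. The sketch below uses Python's `cmath` rather than Mathematica, and the example quartic $x^4+3x^2+6x+10=(x^2+2x+2)(x^2-2x+5)$ is an assumption chosen here because its cubic has roots $-4$, $9$ and $1$, two of them positive.

```python
import cmath

# Two positive values, as can occur among the roots of the cubic.
m1, m2 = 9.0, 1.0
print(cmath.sqrt(-m1) * cmath.sqrt(-m2))   # (3j) * (1j) = -3
print(cmath.sqrt(m1 * m2))                 # 3

# x^4 + 3x^2 + 6x + 10 has q = 6 > 0 and cubic roots m = -4, 9, 1,
# yet the principal radicals multiply to -q rather than +q.
p, q, r = 3.0, 6.0, 10.0
s = [cmath.sqrt(-m) for m in (-4.0, 9.0, 1.0)]
product = s[0] * s[1] * s[2]

# The all-minus combination from the q > 0 formulas is then not a root.
candidate = (-s[0] - s[1] - s[2]) / 2
residual = candidate**4 + p * candidate**2 + q * candidate + r
print(product, abs(residual))
```

Here the sign of $q$ alone does not decide which set of formulas applies: what matters is whether the product of the three chosen radicals equals $+q$ or $-q$.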