This page covers the supplementary problems at the end of chapter 3.

The problems below are paraphrased from/inspired by those given in Topics in Algebra by Herstein. The solutions are my own unless otherwise noted. I will generally try, in my solutions, to stick to the development in the text. This means that problems will not be solved using ideas and theorems presented further on in the book.

Let $a,b\in R$ be such that $ab+P=(a+P)(b+P)=0+P.$

The statement that $R/P$ is an integral domain is equivalent to the statement that $ab\in P$ implies $a\in P$ or $b\in P$, which is equivalent to the statement that $P$ is a prime ideal.

Theorem 3.5.1 states that $M$ is a maximal ideal of a commutative, unital ring $R$ if and only if $R/M$ is a field. Therefore if $M$ is a maximal ideal of $R$, then $R/M$ is a field and thus an integral domain. By exercise 3.1, $M$ is a prime ideal.

The zero ideal of $\mathbb{Z}$ is already prime but not maximal; it is more interesting to look for a non-zero example. For that, the ring $R$ cannot be a PID, because a non-zero prime ideal will always be maximal in that case: if $P=(p)$ is a non-zero prime ideal, then an ideal $I=(r)$ containing it will have $r\mid p$, which won’t happen in any non-trivial way. So we avoid PIDs and look for things that are somewhat more exotic, but non-commutative rings are probably too exotic. A natural place to look is polynomial rings.

One thought is something like $(x^2+1)$ in $\mathbb{R}[x]$. This is a prime ideal but it’s also relatively easy to see that it is maximal. However, if we consider instead $I=(x^2+2)$ in $R=\mathbb{Z}[x]$, we have it. The generator $x^2+2$ is irreducible so, because $\mathbb{Z}[x]$ is a UFD, the ideal is prime. The ideal $J=(x^2,2)=\{2m+nx^2\mid m,n\in\mathbb{Z}[x]\}$

properly contains it because it contains elements (e.g. $2$) with degree smaller than $2$. However, $J\mathrm{\ne}R$ because $1$ is clearly not in $J$. Therefore $I$ is prime but not maximal in $R=\mathbb{Z}\left[x\right]$.

Note that $I=(x+2)$ and $J=(x,2)$ would work via the same argument.

Let $P$ be a prime ideal of $R$. We have by exercise 3.1 that $R\mathrm{/}P$ is an integral domain, and by lemma 3.2.2 that a finite integral domain must be a field. Therefore $R\mathrm{/}P$ is a field, so that theorem 3.5.1 gives us that $P$ is a maximal ideal of $R$.

The meaning of this exercise is not clear. It seems that we just want to show that the name of the indeterminate is irrelevant. This can be done by considering the obvious mapping $\varphi :F\left[x\right]\to F\left[t\right]$ with $\varphi ({a}_{0}+{a}_{1}x+\cdots +{a}_{n}{x}^{n})={a}_{0}+{a}_{1}t+\cdots +{a}_{n}{t}^{n}\mathrm{.}$

Then $\varphi((fg)(x))=(fg)(t)=f(t)g(t)=\varphi(f(x))\varphi(g(x))$

and $\varphi((f+g)(x))=(f+g)(t)=f(t)+g(t)=\varphi(f(x))+\varphi(g(x)).$

Therefore $\varphi $ is a homomorphism, and it is clear that it is both injective and onto.

Let $f\in F[x]$ be given by $f(x)=a_0+\cdots+a_nx^n$. Consider $\sigma(f(x))=\sigma(a_0+a_1x+\cdots+a_nx^n)=a_0+a_1\sigma(x)+\cdots+a_n\sigma(x)^n$

which has simplified because $\sigma$ is a homomorphism and fixes the coefficients. From this we see that the $F$-fixing automorphism $\sigma$ is determined entirely by where it maps the polynomial $x=0+1x+0x^2+\cdots$. If $\sigma(x)$ is a constant, then the map cannot be surjective, because only constants are in the image of $\sigma$. If $\deg(\sigma(x))>1$, then again the map $\sigma$ cannot be surjective: the degree of a non-constant polynomial $a_0+a_1\sigma(x)+\cdots+a_n\sigma(x)^n$ with $a_n\ne 0$ is $n\cdot\deg(\sigma(x))\ge\deg(\sigma(x))>1$. Hence no polynomials of degree $1$ would be in the image of $\sigma$.

Therefore $\sigma(x)=\alpha x+\beta$ for some $\alpha,\beta\in F$ is the only remaining possibility. The question is whether $\sigma(x)$ so chosen respects the fact that the map must be an automorphism. We have $\sigma(f(x))=\sigma(a_0+a_1x+\cdots+a_nx^n)=a_0+a_1(\alpha x+\beta)+\cdots+a_n(\alpha x+\beta)^n=f(\alpha x+\beta).$

This is simply a composition and we easily see that $\sigma$ is still a homomorphism: $\sigma((f+g)(x))=(f+g)(\alpha x+\beta)=f(\alpha x+\beta)+g(\alpha x+\beta)=\sigma(f(x))+\sigma(g(x))$

and $\sigma((fg)(x))=(fg)(\alpha x+\beta)=f(\alpha x+\beta)g(\alpha x+\beta)=\sigma(f(x))\sigma(g(x)).$

Suppose $f,g\in F\left[x\right]$ are such that $\sigma \left(f\right)=\sigma \left(g\right)$, and suppose $\alpha \mathrm{\ne}0$. Then $0=(f-g)(\alpha x+\beta )={b}_{0}+{b}_{1}(\alpha x+\beta )+\cdots +{b}_{m}(\alpha x+\beta {)}^{m}$

where ${b}_{i}$ are the differences of the coefficients of $f$ and $g$, and $m$ is the maximum of the two degrees. Starting from the degree $m$ term, it is clear that ${b}_{m}=0$ because there is no way to cancel the ${x}^{m}$ term otherwise. Next the degree $m-1$ term suffers the same fate, and so on, down the chain. Thus all of the coefficients must vanish identically, so that $\sigma \left(f\right)=\sigma \left(g\right)$ implies $f=g$, i.e. that $\sigma $ is injective. This argument would fail if $\alpha =0$ because there are no powers of $x$ to speak of. However, $\sigma $ is not injective in the $\alpha =0$ case: $\sigma \left(x\right)=\sigma \left(\beta \right)$ and $x\mathrm{\ne}\beta $. Therefore we must restrict $\alpha \mathrm{\ne}0$.

Observe that, if $\sigma(f(x))=f(\alpha x+\beta)$, then $\sigma(\alpha^{-1}x-\alpha^{-1}\beta)=\alpha^{-1}(\alpha x+\beta)-\alpha^{-1}\beta=x.$

Therefore $\sigma$ is surjective because $a_0+a_1x+\cdots+a_nx^n$

is the image under $\sigma$ of $a_0+a_1(\alpha^{-1}x-\alpha^{-1}\beta)+\cdots+a_n(\alpha^{-1}x-\alpha^{-1}\beta)^n.$

To summarize, we have shown that if $\sigma$ is an automorphism of $F[x]$ which fixes the coefficient field $F$, then $\sigma$ must be of the form $\sigma(f(x))=f(\alpha x+\beta)$

where $\alpha ,\beta \in F$ and $\alpha \mathrm{\ne}0$.

If we wanted to replace $F$ by a ring $R$, we note that if $R$ is not an integral domain, then it is conceivable that $\deg(\sigma(x))>1$ would be an acceptable situation, so there might be much more to study here. However, if $R$ is an integral domain, the argument carries through but $\alpha$ must be a unit rather than simply non-zero.
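The inverse relationship between $f(\alpha x+\beta)$ and $f(\alpha^{-1}x-\alpha^{-1}\beta)$ can be spot-checked with exact arithmetic. What follows is a minimal sketch over $F=\mathbb{Q}$, assuming polynomials are stored as low-to-high coefficient lists. The helper names (`trim`, `poly_add`, `poly_mul`, `compose`) and the sample values of $\alpha$, $\beta$, $f$ and $g$ are illustrative choices, not part of the exercise.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients (keep at least one entry)."""
    f = list(f)
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f


def poly_add(f, g):
    """Add two polynomials given as low-to-high coefficient lists."""
    n = max(len(f), len(g))
    f = list(f) + [Fraction(0)] * (n - len(f))
    g = list(g) + [Fraction(0)] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])


def poly_mul(f, g):
    """Multiply two polynomials given as low-to-high coefficient lists."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def compose(f, g):
    """Return f(g(x)) using Horner's rule."""
    result = [Fraction(0)]
    for c in reversed(f):
        result = poly_add(poly_mul(result, g), [c])
    return result


# Hypothetical sample data; alpha must be non-zero.
alpha, beta = Fraction(3, 2), Fraction(-5)
sigma_x = [beta, alpha]                   # sigma(x) = alpha*x + beta
inverse_x = [-beta / alpha, 1 / alpha]    # alpha^{-1} x - alpha^{-1} beta

f = [Fraction(c) for c in (1, 0, -3, 2)]  # 1 - 3x^2 + 2x^3
g = [Fraction(c) for c in (4, 1)]         # 4 + x

# sigma applied to f(alpha^{-1} x - alpha^{-1} beta) should give back f.
print(compose(compose(f, inverse_x), sigma_x) == f)
# sigma should respect products.
print(compose(poly_mul(f, g), sigma_x)
      == poly_mul(compose(f, sigma_x), compose(g, sigma_x)))
```

A check on one pair of polynomials is of course not a proof; it only guards against sign or inverse mistakes in the formula for the preimage.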

**(a)** If $r,s\in N$ with ${r}^{m}=0$ and ${s}^{n}=0$, then
$(r+s{)}^{m+n}=\sum _{k=0}^{m+n}\left(\genfrac{}{}{0px}{}{m+n}{k}\right){r}^{k}{s}^{m+n-k}$

by the binomial theorem. The idea here is that every term in this sum is zero because either $r$ is raised to a high enough power or $s$ is raised to a high enough power. This is most clear if we change the variable of summation to $\mathrm{\ell}=k-m$ so that $(r+s{)}^{m+n}=\sum _{\mathrm{\ell}=-m}^{n}\left(\genfrac{}{}{0px}{}{m+n}{m+\mathrm{\ell}}\right){r}^{m+\mathrm{\ell}}{s}^{n-\mathrm{\ell}}\mathrm{.}$

Now if $\mathrm{\ell}\ge 0$ we have ${r}^{m+\mathrm{\ell}}=0$, while $\mathrm{\ell}<0$ gives ${s}^{n-\mathrm{\ell}}={s}^{n+\mathrm{\mid}\mathrm{\ell}\mathrm{\mid}}=0$. Hence $(r+s{)}^{m+n}=0$ and so $r+s\in N$ whenever $r,s\in N$.

In addition, if $r\in N$ with ${r}^{m}=0$ and $\alpha \in R$, then $(\alpha r{)}^{m}={\alpha}^{m}{r}^{m}=0$ so $\alpha r\in N$. Therefore, $N$ is an ideal of $R$.

**(b)** If $(r+N{)}^{m}=N$, then ${r}^{m}\in N$. Thus there exists $n$ such that $0=({r}^{m}{)}^{n}={r}^{mn}$, which shows that $r$ also belongs to $N$.

This is reminiscent of $R\mathrm{/}N$ being an integral domain. It’s not that we necessarily have no zero divisors in $R\mathrm{/}N$, but we have no non-trivial nilpotent elements in $R\mathrm{/}N$. Of course, nilpotent elements are a particular class of zero divisor. If $R\mathrm{/}N$ were an integral domain, then we could say by exercise 3.1 that $N$ is a prime ideal. While this might not be true, it is indeed true that the nilradical is related to prime ideals. It turns out that $N$ is the intersection of all prime ideals of $R$ (proof).

**(a)** If $r\in A$ then $r$ is immediately a member of $N\left(A\right)$ because ${r}^{1}\in A$. Thus $A\subset N\left(A\right)$. If $r,s\in N\left(A\right)$ then the proof of exercise 3.7a carries over almost verbatim to show that $r+s\in N\left(A\right)$. The various terms in the binomial expansion do not vanish, but they all belong to $A$, which is an ideal. Hence $r+s\in N\left(A\right)$. If $r\in N\left(A\right)$ with ${r}^{m}\in A$ and $\alpha \in R$, then $(\alpha r{)}^{m}={\alpha}^{m}{r}^{m}\in A$ so that $\alpha r\in N\left(A\right)$. Therefore $N\left(A\right)$ is an ideal containing $A$.

**(b)** Let $r\in N(N(A))$ so that $r^m\in N(A)$ for some positive integer $m$. Then there exists a positive integer $n$ with $(r^m)^n=r^{mn}\in A$ and hence $r\in N(A)$. This shows that $N(N(A))\subset N(A)$. By part (a), we already know that $N(A)\subset N(N(A))$. Therefore $N(N(A))=N(A)$.

The nilradical is the radical of the zero ideal. Note that the parts of exercise 3.7 are the respective special cases of the parts of exercise 3.8, despite the fact that part (b) of exercise 3.7 is expressed somewhat differently.

If $r\in\mathbb{Z}/n\mathbb{Z}$ is an element of the nilradical, then (choosing an integer representative for $r$) there exist $m,a\in\mathbb{Z}$ with $m\ge 1$ and $r^m=an$. The prime factors of the left hand side will always be exactly those of $r$. If one of the prime factors of $n$ is missing from $r$, then this equation will have no solution. To be more explicit, say that $n=p_1^{i_1}\cdots p_k^{i_k};$

then we claim that $r$ belongs to the nilradical if and only if $r$ is a multiple of ${p}_{1}\cdots {p}_{k}$. This statement correctly describes even the trivial case of $r=0$, but we exclude that case in what follows.

If $r$ is non-zero and of this form, then take the power $m$ large enough that each prime $p_l$ appears in $r^m$ to a power at least as large as its power in $n$, and then choose the coefficient $a=r^m/n$.

On the other hand, if non-zero $r$ is known to be in the nilradical, then we have ${r}^{m}=an$ for some $m,a$. If $p$ is a prime dividing $n$, then it also must divide the left hand side, $p\mid {r}^{m}$. As $p$ is a prime, we have further that $p\mid r$. This holds for each prime, so every prime dividing $n$ must also appear, with at least one power, in $r$. This proves the claim.

As an example, consider $\mathbb{Z}\mathrm{/}24\mathbb{Z}$ where $24={2}^{3}\cdot 3$. The prime product is $2\cdot 3=6$. The elements of the nilradical of $\mathbb{Z}\mathrm{/}24\mathbb{Z}$ are $6$ (${6}^{3}=24\cdot 9$), $2\cdot 6$ ($1{2}^{2}=24\cdot 6$), $3\cdot 6$ ($1{8}^{3}=24\cdot 243$) and of course $0$. On the other hand, something like ${2}^{1}\cdot {3}^{0}$ is not in the nilradical because $24a$ is always divisible by $3$ whereas ${2}^{m}$ never is.
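The claim that the nilradical of $\mathbb{Z}/n\mathbb{Z}$ consists of the multiples of $p_1\cdots p_k$ is easy to test by brute force. The sketch below is only a sanity check under that reading of the claim; the function names `nilradical` and `prime_product` are made up for illustration.

```python
def nilradical(n):
    """Return the nilpotent residues modulo n, found by brute force."""
    nilpotent = []
    for r in range(n):
        power = r % n
        # Check r^1, r^2, ..., r^n; a nilpotent residue vanishes well before r^n.
        for _ in range(n):
            if power == 0:
                nilpotent.append(r)
                break
            power = (power * r) % n
    return nilpotent


def prime_product(n):
    """Product of the distinct primes dividing n (trial division)."""
    result, p = 1, 2
    while n > 1:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    return result


print(nilradical(24))      # [0, 6, 12, 18]
print(prime_product(24))   # 6
```

For $n=24$ this reproduces the four elements listed above, and the same comparison can be run over a range of moduli.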

Because $A$ is an ideal (and therefore closed under external multiplication), $ab\in A$; because $B$ is an ideal, $ab\in B$. Therefore $ab\in A\cap B=\left(0\right)$ so $ab=0$.

This is the analogue of the center of a group, the set of all elements that commute with everything. Let ${x}_{1},{x}_{2}\in Z\left(R\right)$ and let $y$ be an arbitrary element of $R$. Then ${x}_{1}{x}_{2}y={x}_{1}\left({x}_{2}y\right)={x}_{1}\left(y{x}_{2}\right)=\left(y{x}_{1}\right){x}_{2}=y{x}_{1}{x}_{2}$

showing that ${x}_{1}{x}_{2}\in Z\left(R\right)$. Similarly, $({x}_{1}+{x}_{2})y=y({x}_{1}+{x}_{2})$ so that ${x}_{1}+{x}_{2}\in Z\left(R\right)$. Of course, if there exists $1\in R$, then $1\in Z\left(R\right)$. Therefore $Z\left(R\right)$ is a subring of $R$. It is patently a commutative ring.

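By exercise 3.11, we know that $Z(R)$ is a commutative subring. If $R$ is a division ring then so too is $Z(R)$: if $x\in Z(R)$ is non-zero and $y\in R$, then multiplying $xy=yx$ on both sides by $x^{-1}$ gives $yx^{-1}=x^{-1}y$, so $x^{-1}\in Z(R)$. A commutative division ring is a field, by definition.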

It seems a difficult problem in general to construct arbitrary irreducible polynomials. However, with small degree and a small field of coefficients, we can force it through. We observe that we may restrict attention to $f\in F\left[x\right]$ monic, because $F$ is a field so the highest coefficient is invertible and factoring it out does not affect reducibility. In addition, a reducible degree $3$ polynomial must have a linear factor because the only non-trivial way to partition $3$ is $1+2$. Then if $f$ is monic and reducible, we will be able to write $f\left(x\right)=(x-\alpha )({x}^{2}+\beta x+\gamma )$

where the latter polynomial may be further reducible, but that is of no concern. Clearly such a polynomial must map some element $\alpha \in F$ to zero. Returning to the problem at hand, we now know that a degree $3$ polynomial $f\in F\left[x\right]$ which has no root in $F$ must be irreducible.

Now a simple brute force search is easy. We restrict to monic degree $3$ polynomials, searching for one which doesn’t map any of $0,1,2$ to zero. For simplicity, we keep the ${x}^{2}$ term out of it. The first thing to try is $f\left(x\right)={x}^{3}+x+1$ but it has $f\left(1\right)=0$. Next, $f\left(x\right)={x}^{3}+2x+1$, which works: $f\left(0\right)=1$, $f\left(1\right)=1$ and $f\left(2\right)=1$. Therefore one such irreducible polynomial is $f\left(x\right)={x}^{3}+2x+1,$

and there are other possibilities.

By exercise 3.9.7, $F\left[x\right]\mathrm{/}({x}^{3}+2x+1)$ is a field of ${3}^{3}=27$ elements.
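The brute force search described above is small enough to automate. The following is a rough sketch, assuming polynomials over $\mathbb{Z}/3\mathbb{Z}$ are stored as low-to-high coefficient lists; `evaluate` and `has_root` are hypothetical helper names. It relies on the observation made earlier that a cubic with no root in $F$ is irreducible.

```python
from itertools import product

P = 3  # we work over the field Z/3Z


def evaluate(coeffs, x, p=P):
    """Evaluate a polynomial (low-to-high coefficients) at x modulo p."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % p
    return value


def has_root(coeffs, p=P):
    """True if the polynomial vanishes at some element of Z/pZ."""
    return any(evaluate(coeffs, x, p) == 0 for x in range(p))


# x^3 + 2x + 1, stored as [constant, x, x^2, x^3].
f = [1, 2, 0, 1]
print([evaluate(f, x) for x in range(P)])   # [1, 1, 1]

# All monic cubics x^3 + c2 x^2 + c1 x + c0 with no root in Z/3Z.
rootless = [c for c in product(range(P), repeat=3)
            if not has_root(list(c) + [1])]
print(len(rootless))
for c0, c1, c2 in rootless:
    print(f"x^3 + {c2}x^2 + {c1}x + {c0}")
```

The list confirms that $x^3+2x+1$ is one of several monic cubics without a root, in line with the remark that there are other possibilities.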

Note that the connection found here between roots and irreducibility is not of general use. There are sometimes polynomials over fields which have no roots but are nevertheless reducible, such as $({x}^{2}+1{)}^{2}={x}^{4}+2{x}^{2}+1$ over $\mathbb{R}$. The observation is special for degree $3$, where any reduction involves a degree one term. It does not even extend to higher odd degrees, because one can imagine a fifth degree polynomial that splits into irreducible factors of degree $2$ and $3$, neither of which has a root, e.g. $({x}^{2}-2)({x}^{3}-2)={x}^{5}-2{x}^{3}-2{x}^{2}+4$ over $\mathbb{Q}$ (both factors are irreducible by exercise 3.10.2).

$625={5}^{4}$ so exercise 3.9.7 suggests that we look for a degree $4$ polynomial $f$ irreducible over $F=\mathbb{Z}\mathrm{/}5\mathbb{Z}$. Then $F\left[x\right]\mathrm{/}\left(f\right)$ will be the desired field.

For degree $4$, the methods of the previous problem are not helpful. Therefore we try brute force: writing down the simplest polynomials and manually checking that they are irreducible, hoping to get lucky. A few observations are helpful. 1) If $f\in F[x]$ were reducible, it could be factored into two degree $2$ polynomials, or into a degree $1$ and a degree $3$ polynomial. In the latter case, $f$ must have a root in $F$ due to its linear factor. Therefore we look for candidates which have no roots in $F$, but that is necessary and not sufficient. 2) Fermat’s little theorem says that $x^4=1$ for every non-zero $x$ in our field. 3) The quadratic residues modulo $5$ are $\{0,1,4\}$.

$f(x)=x^4+1$ has no root because $f(0)=1$ and $x^4+1=2$ for non-zero $x\in F$. We then try to factor it as $x^4+1=(x^2+\alpha x+\beta)(x^2+\gamma x+\delta)$

and find that the equations for each coefficient have a consistent solution: $\alpha =\gamma =0$, $\beta =2$, $\delta =3$. Thus ${x}^{4}+1=({x}^{2}+2)({x}^{2}+3)$ is reducible.

$f(x)=x^4+x+1$ has a root at $x=3$. More generally, $x^4+kx+1$ takes the value $2+kx$ at each non-zero $x$, and quick inspection shows that any non-zero $k$ gives a polynomial with a root (namely $x=-2k^{-1}$), which is therefore reducible.

$f(x)=x^4+x^2+1$ has $f(0)=1$ and takes the values $2+x^2\in\{1,3\}$ at non-zero $x\in F$, so it has no roots. However, again writing down the equations for the coefficients in a product of quadratics, we find that $x^4+x^2+1=(x^2+x+1)(x^2-x+1)$ is reducible.

$f(x)=x^4+x^2+x+1$ has $f(0)=1$ and takes the values $2+x+x^2\in\{2,3,4\}$ at non-zero $x\in F$, so it has no roots. The equations for the coefficients are $\gamma=-\alpha$, $\delta=\beta^{-1}$, $\beta+\delta+\alpha\gamma=1$ and $\alpha\delta+\beta\gamma=1$. Substituting the first two equations into the last two, we see that $\beta+\beta^{-1}-\alpha^2=1\qquad\text{and}\qquad\alpha(\beta^{-1}-\beta)=1.$

The second of these new equations forces $\beta^{-1}\ne\beta$, i.e. $\beta^2\ne 1$, so $\beta\in\{2,3\}$. These two elements are inverse to one another modulo $5$, so $\beta+\beta^{-1}=0$ and the first equation reads $\alpha^2=-1$. On the other hand, $\beta^{-1}-\beta=\pm 1$, so the second equation gives $\alpha=\pm 1$ and hence $\alpha^2=1$. Since $1\ne -1$ in $F$, we have our desired contradiction at long last. Thus $f(x)=x^4+x^2+x+1$ is irreducible over $F$ and $F[x]/(f)$ is a field of order $625$.
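The case analysis for the quartics above can be double-checked by exhaustive search, since there are only $5^4$ pairs of monic quadratics over $\mathbb{Z}/5\mathbb{Z}$. This is a sketch under the assumption that polynomials are low-to-high coefficient lists; `multiply`, `quadratic_factorizations` and `has_root` are illustrative names.

```python
from itertools import product

P = 5  # we work over the field Z/5Z


def multiply(f, g, p=P):
    """Multiply two polynomials (low-to-high coefficients) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def quadratic_factorizations(quartic, p=P):
    """All ordered pairs of monic quadratics whose product is the given monic quartic."""
    found = []
    for a0, a1, b0, b1 in product(range(p), repeat=4):
        left, right = [a0, a1, 1], [b0, b1, 1]
        if multiply(left, right, p) == quartic:
            found.append((left, right))
    return found


def has_root(poly, p=P):
    """True if the polynomial vanishes at some element of Z/pZ."""
    return any(sum(c * pow(x, k, p) for k, c in enumerate(poly)) % p == 0
               for x in range(p))


candidates = {
    "x^4 + 1": [1, 0, 0, 0, 1],
    "x^4 + x^2 + 1": [1, 0, 1, 0, 1],
    "x^4 + x^2 + x + 1": [1, 1, 1, 0, 1],
}
for name, poly in candidates.items():
    print(name, has_root(poly), quadratic_factorizations(poly))
```

A monic quartic with no root and no factorization into monic quadratics is irreducible, so an empty list together with `has_root` returning `False` confirms irreducibility.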

For an element $f\in F\left[x\right]$ to be in the nilradical $N$ of $R$, it means there exists an integer $n$ such that $p\left(x\right)$ divides $f(x{)}^{n}$. First we show that $N$ being trivial implies that $p$ cannot be divisible by a square. Consider the contrapositive statement: if $p$ is divisible by a square, say $p={f}^{2}g$ with $f,g\in F\left[x\right]$, then $(fg{)}^{2}=pg\in (p)$ so that $N$ is non-trivial because $fg$ is in it. Hence if $N$ is trivial then $p$ must not be divisible by a square.

Now suppose that $p$ is not divisible by any square. Using the fact that $F[x]$ is a UFD, we can write $p=\pi_1\cdots\pi_m$ with the $\pi_i\in F[x]$ irreducible and pairwise distinct, with no two differing by a unit factor. If $f\in N$, then there is a positive integer $n$ with $p\mid f^n$. Then by lemma 3.7.6 every $\pi_i\mid f$ and therefore, the $\pi_i$ being distinct irreducibles, $p\mid f$. Viewed in the quotient ring $R$, $f=0$. Hence if $p$ is not divisible by a square, the nilradical of $R$ is trivial.

Note that the squarefree property is truly necessary in the second paragraph. Take this example in the integers: $p=18=2\cdot {3}^{2}$ and $f=6$. We have $p\mid {f}^{2}$ but of course $p\nmid f$.

We observe that $-1$, which belongs to any field, is a root of $f$. Hence $f(x)$ is divisible by the polynomial $(x+1)$ and is therefore reducible.

This is immediate from the Eisenstein criterion with $p=2$.

First, the notion of characteristic (as defined by Herstein) is relevant here because a field is an integral domain (exercise 3.2.12). The characteristic of $F$ is finite by the pigeonhole principle applied to the set $\{1,2\cdot 1,3\cdot 1,\dots \}$: the list must repeat because $F$ is finite, so there exist $\alpha ,\beta \in \mathbb{Z}$ such that $\alpha 1=\beta 1$; therefore $\mathrm{\mid}\alpha -\beta \mathrm{\mid}1=0$. By exercise 3.2.6, the characteristic of $F$ is prime because it is finite.

Because $F$ has characteristic $p$, we know that ${F}_{0}=\{0,1,2,\dots ,p-1\}\subset F\mathrm{.}$

Now, we are familiar with $F_0\cong\mathbb{Z}/p\mathbb{Z}$; it is a field containing $p$ elements. If $F=F_0$ then we have shown that $|F|=p^1$ and we are done. However, suppose there exists $x\in F$ with $x\notin F_0$, and consider the set $F_1=\{\alpha_1x+\alpha_0\mid\alpha_0,\alpha_1\in F_0\}.$

Clearly ${F}_{0}\subset {F}_{1}\subset F$, and we will show in a moment that it contains ${p}^{2}$ elements. Note that we will **not** claim that ${F}_{1}$ is a field or even closed under multiplication. For instance, it is not clear at all at this point whether ${x}^{2}$ would belong to ${F}_{1}$. Nevertheless, we will find this construction useful. The sketch of the proof from this point is as follows: we repeat this procedure for as long as $F$ contains an element outside our constructed subsets. The procedure surely terminates because each step generates a proper superset of the preceding step’s set, and $F$ is finite. Moreover, the set generated in any step is $p$ times as large as the preceding set. Hence, when the procedure terminates, we realize that the size of $F$ must be a power of $p$.

First we show that ${F}_{1}$ contains ${p}^{2}$ elements. It is clear that we can enumerate ${p}^{2}$ elements in ${F}_{1}$ because there are $p$ choices for each of the coefficients ${\alpha}_{0}$ and ${\alpha}_{1}$. However, must they all be unique? Yes. If ${\alpha}_{1}^{\mathrm{\prime}}x+{\alpha}_{0}^{\mathrm{\prime}}={\alpha}_{1}x+{\alpha}_{0}$, then we see that $({\alpha}_{1}^{\mathrm{\prime}}-{\alpha}_{1})x={\alpha}_{0}-{\alpha}_{0}^{\mathrm{\prime}}\mathrm{.}$

If ${\alpha}_{1}^{\mathrm{\prime}}-{\alpha}_{1}$ is non-zero, then it is an invertible element of ${F}_{0}$, so that we have $x=({\alpha}_{1}^{\mathrm{\prime}}-{\alpha}_{1}{)}^{-1}({\alpha}_{0}-{\alpha}_{0}^{\mathrm{\prime}})\in {F}_{0},$

a contradiction. Therefore ${\alpha}_{1}^{\mathrm{\prime}}={\alpha}_{1}$ and consequently ${\alpha}_{0}^{\mathrm{\prime}}={\alpha}_{0}$, the element is unique. This proves that there are ${p}^{2}$ elements in ${F}_{1}\subset F$.

Now we would like to show the validity of the procedure in general. Suppose $F_{k-1}\subset F$ with $|F_{k-1}|=p^k$, and suppose there exists $y\in F$ with $y\notin F_{k-1}$. Then construct $F_k=\{\alpha_ky+\beta\mid\alpha_k\in F_0,\ \beta\in F_{k-1}\}.$

We can enumerate ${p}^{k}\cdot p={p}^{k+1}$ elements in ${F}_{k}$, and clearly ${F}_{k}\subset F$. Are any of the ${p}^{k+1}$ elements duplicates? No. If ${\alpha}_{k}^{\mathrm{\prime}}y+{\beta}^{\mathrm{\prime}}={\alpha}_{k}y+\beta ,$

then $({\alpha}_{k}^{\mathrm{\prime}}-{\alpha}_{k})y=\beta -{\beta}^{\mathrm{\prime}}$ and the argument from above applies again: if ${\alpha}_{k}^{\mathrm{\prime}}-{\alpha}_{k}$ is non-zero, then it is an invertible element of ${F}_{0}$ and $y=({\alpha}_{k}^{\mathrm{\prime}}-{\alpha}_{k}{)}^{-1}(\beta -{\beta}^{\mathrm{\prime}})\mathrm{.}$

We know, however, that ${F}_{k-1}$ is closed under multiplication by elements of ${F}_{0}$, because ${F}_{0}$ is a field. Thus, this line of reasoning has us conclude that $y\in {F}_{k-1}$, a contradiction. Therefore we have ${\alpha}_{k}^{\mathrm{\prime}}={\alpha}_{k}$ and consequently ${\beta}^{\mathrm{\prime}}=\beta $, the two elements are identical. Now we have shown the recurrence $\mathrm{\mid}{F}_{k}\mathrm{\mid}=p\mathrm{\mid}{F}_{k-1}\mathrm{\mid}$.

Because $F$ is finite, it will eventually be exhausted of its elements and we will have some maximal $n$ such that $F_n\subset F$, but there does not exist another $z\in F$ with $z\notin F_n$. But this means that $F\subset F_n$, so that $F=F_n$ and $|F|=p^{n+1}$. This is the desired result: the number of elements in a finite field must be a power of a prime.

Finally, suppose that $|F|=p^n$. We have that the non-zero elements of $F$ form a group under multiplication, and there are $p^n-1$ of them. By Lagrange’s theorem, the order of any element divides $p^n-1$. Then if $a\in F$ is non-zero, we surely have $a^{p^n-1}=1$

and therefore $a^{p^n}=a$, which also holds trivially for $a=0$.
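As a concrete illustration of $a^{p^n}=a$, one can take the $27$-element field $F[x]/(x^3+2x+1)$ over $F=\mathbb{Z}/3\mathbb{Z}$ constructed in an earlier exercise. The sketch below assumes elements are represented as coefficient triples $(c_0,c_1,c_2)$ standing for $c_0+c_1x+c_2x^2$; this representation and the function names `multiply` and `power` are choices made here for illustration.

```python
from itertools import product

P = 3
MODULUS = [1, 2, 0, 1]  # x^3 + 2x + 1, low-to-high coefficients


def multiply(a, b):
    """Multiply two elements of F_3[x]/(x^3 + 2x + 1), given as coefficient triples."""
    prod = [0] * 5
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % P
    # Eliminate the degree 4 and degree 3 terms using the monic modulus.
    for deg in (4, 3):
        lead = prod[deg]
        if lead:
            for k, m in enumerate(MODULUS):
                prod[deg - 3 + k] = (prod[deg - 3 + k] - lead * m) % P
    return tuple(prod[:3])


def power(a, n):
    """Compute a^n by repeated multiplication."""
    result = (1, 0, 0)
    for _ in range(n):
        result = multiply(result, a)
    return result


elements = list(product(range(P), repeat=3))
print(len(elements))                              # 27
print(all(power(a, 27) == a for a in elements))
print(all(power(a, 26) == (1, 0, 0) for a in elements if a != (0, 0, 0)))
```

Both checks are expected to succeed precisely because the quotient is a field: the $26$ non-zero elements form a multiplicative group.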

Let $I\subset\mathbb{Z}[i]$ be a non-zero ideal and let $z=a+ib\in I$ with $a,b\in\mathbb{Z}$ not both zero. Then $z\bar{z}=(a+ib)(a-ib)=a^2+b^2\in I$

because $I$ is closed under external multiplication. Because one of $a$ or $b$ is non-zero, ${a}^{2}+{b}^{2}$ is a positive integer.

Like exercise 3.4.19 (where ${x}^{3}=x$ for all $x$ implies commutativity), this problem is hard. We follow the great proof posted by Steve D. on math.stackexchange.com.

First observe that $-x=(-x)^4=x^4=x$ so that $2x=0$ for any $x\in R$. Next, we consider the magic combination $x^2+x$, which was also of interest in exercise 3.4.19. We can try to take the fourth power, but it gives no information. However, we have that $(x^2+x)^2=x^4+2x^3+x^2=x^4+x^2=x^2+x.$

Now we make some subtle, and seemingly unrelated, statements.

**(1)** If $x,y\in R$ have $xy=0$, then $yx=0$. Using the special property of $R$, we have $yx=(yx{)}^{4}=y(xy{)}^{3}x=0$.

**(2)** If $x\in R$ has ${x}^{2}=x$, then $x$ commutes with every element of $R$: let $y\in R$ and consider
$0=xy-{x}^{2}y=x(y-xy)=(y-xy)x,$

so that $yx=xyx$. In the final step, we used property (1). Now do this again, $0=yx-y{x}^{2}=(y-yx)x=x(y-yx),$

which gives $xy=xyx$. Combining the two, we see that if ${x}^{2}=x$ then $xy=xyx=yx$ for any $y\in R$.

We are not done, because not every $x\in R$ satisfies $x^2=x$. Let $r,s\in R$ and expand the equality $r\left((r+s)^2+(r+s)\right)=\left((r+s)^2+(r+s)\right)r$

which holds because $t=(r+s{)}^{2}+(r+s)$ satisfies ${t}^{2}=t$. Now, canceling the identical terms, we are left with $({r}^{2}+r)s+r{s}^{2}=s({r}^{2}+r)+{s}^{2}r,$

but we know that ${r}^{2}+r$ commutes with $s$, so we have $r{s}^{2}={s}^{2}r$ for arbitrary $r,s\in R$. Finally, we can make the statement that $rs=(r+{r}^{2})s-{r}^{2}s=s(r+{r}^{2})-s{r}^{2}=sr$

for any $r,s\in R$, so that $R$ is commutative.

We follow Herstein’s hint, which is to fix $a\in R$ and to consider the sets $W_a=\{x\in R\mid\varphi(ax)=\varphi(a)\varphi(x)\}$

and $V_a=\{x\in R\mid\varphi(ax)=\varphi(x)\varphi(a)\}.$

That is, ${W}_{a}$ is those $x$ that fall into the first category and ${V}_{a}$ is those $x$ that fall into the second category. We must have ${W}_{a}\cup {V}_{a}=R$ and, of course, $a$ belongs to both so that neither is empty. We seek to prove that one or both of the sets is equal to $R$.

Suppose there exists $b\in W_a$ with $b\notin V_a$, so that $\varphi(ab)=\varphi(a)\varphi(b)$ while $\varphi(ab)\ne\varphi(b)\varphi(a)$. If $c\in R$ is arbitrary, consider $\varphi(a(b+c))=\varphi(ab)+\varphi(ac)=\varphi(a)\varphi(b)+\varphi(ac).$

We can evaluate this quantity in another way: either $\varphi(a(b+c))=\varphi(a)\varphi(b+c)\qquad\text{or}\qquad\varphi(a(b+c))=\varphi(b+c)\varphi(a).$

If the first holds, we would have $\varphi \left(a\right)\varphi \left(b\right)+\varphi \left(ac\right)=\varphi \left(a\right)\varphi \left(b\right)+\varphi \left(a\right)\varphi \left(c\right)$

so that $c\in {W}_{a}$ as desired. The second case leads to $\varphi \left(ac\right)=\varphi \left(c\right)\varphi \left(a\right)+\varphi \left(b\right)\varphi \left(a\right)-\varphi \left(a\right)\varphi \left(b\right)\mathrm{\ne}\varphi \left(c\right)\varphi \left(a\right)$

where we use the fact that $\varphi(b)\varphi(a)-\varphi(a)\varphi(b)\ne 0$. Thus $c\notin V_a$, and since $W_a\cup V_a=R$ we must again conclude that $\varphi(ac)=\varphi(a)\varphi(c)$. Therefore if there exists $b\in W_a$ with $b\notin V_a$, then $W_a=R$. If the circumstance is reversed, so there exists $b\in V_a$ with $b\notin W_a$, the same argument gives that $V_a=R$ in that case.

The only other case to consider is ${W}_{a}\subset {V}_{a}$ or vice versa. Say ${W}_{a}\subset {V}_{a}$, then we know that $R={W}_{a}\cup {V}_{a}={V}_{a}$. Therefore for any fixed $a\in R$, one of ${W}_{a}$ or ${V}_{a}$ is the entire ring.

Now we know that, for fixed $a\in R$, either ${W}_{a}$ or ${V}_{a}$ is the whole of $R$. We must extend this to a global statement about $R$ itself. Following the hint of this math.stackexchange.com post, we consider the two sets $A=\{a\in R\mid {W}_{a}=R\}\phantom{\rule{2em}{0ex}}\mathrm{a}\mathrm{n}\mathrm{d}\phantom{\rule{2em}{0ex}}B=\{a\in R\mid {V}_{a}=R\}\mathrm{.}$

As we know, $A\cup B=R$. It is also easy to see that each of $A$ and $B$ is closed under addition: e.g. $a,a'\in A$ implies $\varphi((a+a')x)=\varphi(ax)+\varphi(a'x)=(\varphi(a)+\varphi(a'))\varphi(x)=\varphi(a+a')\varphi(x)$

so that $a+a'\in A$. Suppose that $A\ne R$ and $B\ne R$. If that is the case, then there exists $a\in A$ with $a\notin B$, and there exists $b\in B$ with $b\notin A$. Then to which set does $a+b$ belong? If $a+b\in A$, then $b=(a+b)-a\in A$ is a contradiction. If $a+b\in B$, then $a=(a+b)-b\in B$ is a contradiction. Therefore we must conclude that one (or both) of $A$ or $B$ is the entire ring, which is the desired result.

**Note**: This very verbose “elementary” argument can be greatly simplified by the result mentioned in the linked hint. Specifically, if $G$ is a group and $G_1,G_2\le G$ satisfy $G_1\cup G_2=G$, then $G_1=G$ or $G_2=G$. The proof is essentially what was done in the preceding paragraph. Namely, this argument by contradiction: If $G_1\ne G$ and $G_2\ne G$ but $G_1\cup G_2=G$, then there exists $g_1\in G_1$ with $g_1\notin G_2$ and there exists $g_2\in G_2$ with $g_2\notin G_1$. Now if $g_1g_2\in G_1$, then $g_2=g_1^{-1}(g_1g_2)\in G_1$ is a contradiction. On the other hand, if $g_1g_2\in G_2$, then $g_1=(g_1g_2)g_2^{-1}\in G_2$ is also a contradiction. Hence one or both subgroups is the entire group $G$.

In the context of this exercise, we can use this result twice. Note that ${W}_{a}$ and ${V}_{a}$ are additive subgroups of $R$ whose union is $R$, so one or the other must be the entire ring. Then $A$ and $B$ are also additive subgroups of $R$ whose union is $R$.

This problem is a standard follow-your-nose element manipulation exercise. In light of the subsequent exercises, it’s clearly going to be important that $1\in R$, so we start by considering things like $a(1+b)$. Let $a,b\in R$ be arbitrary. We have $[a(1+b)]^2=(a+ab)^2=a^2+a^2b+aba+(ab)^2$

but also, using the problem stipulation, $[a(1+b)]^2=a^2(1+b)^2=a^2+2a^2b+a^2b^2.$

This simplifies to give $a^2b=aba$. Similarly, consideration of $[(1+a)b]^2$ gives $ab^2=bab$.

Finally, if we expand $[(1+a)(1+b)]^2=(1+a)^2(1+b)^2$

and cancel the obvious terms, we are left with $ba+bab+aba=ab+{a}^{2}b+a{b}^{2}\mathrm{.}$

Using the previous two results, this simplifies to $ab=ba$. Therefore $R$ is commutative.

By exercise 3.22, the ring cannot be unital. The condition $(ab)^2=a^2b^2$ can be rewritten as $a(ba-ab)b=0$

so we want most or all of the elements of $R$ to be zero divisors (in some cases $ab-ba$ may be zero so we can’t make a general statement). My go-to examples of non-commutative rings are the quaternions and rings of matrices. Making even the simplest computations in the quaternions, we have things like $(ij{)}^{2}=-1$ while ${i}^{2}{j}^{2}=+1$. It is unlikely that a subring of the quaternions will satisfy the condition of this problem, so we set it aside.

I tried many things with $2\times 2$ matrices which all ultimately failed to pan out. For instance, rings generated by simple (single non-zero entry) matrices with entries from the even integers did not satisfy the condition of the problem, and those matrices which square to zero (which easily satisfy the condition of the problem) end up giving a commutative subring.

The space of $3\times 3$ matrices is a big one, so it is natural to restrict attention to one famous subring, the upper triangular matrices. It’s even a good idea to restrict further still, and only consider matrices with zeroes on the diagonal. It turns out that this works. Let $A=\begin{pmatrix}0&1&0\\0&0&0\\0&0&0\end{pmatrix},\qquad B=\begin{pmatrix}0&0&1\\0&0&0\\0&0&0\end{pmatrix},\qquad C=\begin{pmatrix}0&0&0\\0&0&1\\0&0&0\end{pmatrix}.$

Then we consider the set $R=\{\alpha A+\beta B+\gamma C\mid \alpha ,\beta ,\gamma \in \mathbb{Z}\}\mathrm{.}$

Note that $AC=B$ while every other product of $A,B,C$ is zero. Thus it is trivial that $R$ is closed under multiplication, because $(\alpha A+\beta B+\gamma C)({\alpha}^{\mathrm{\prime}}A+{\beta}^{\mathrm{\prime}}B+{\gamma}^{\mathrm{\prime}}C)=\alpha {\gamma}^{\mathrm{\prime}}B\in R\mathrm{.}$

Furthermore, $R$ is non-commutative, because $({\alpha}^{\mathrm{\prime}}A+{\beta}^{\mathrm{\prime}}B+{\gamma}^{\mathrm{\prime}}C)(\alpha A+\beta B+\gamma C)={\alpha}^{\mathrm{\prime}}\gamma B$

will generally not be the same as the previous product. Of course, $R$ is closed under addition. Therefore $R$ is a non-commutative subring of $\mathrm{Mat}_{3\times 3}(\mathbb{Z})$. It also satisfies the condition of the problem, because any product of four matrices in $R$ is proportional to $B^{2}=0$. More explicitly, $[(\alpha A+\beta B+\gamma C)(\alpha' A+\beta' B+\gamma' C)]^{2}=(\alpha\gamma' B)^{2}=0$

and $(\alpha A+\beta B+\gamma C)^{2}(\alpha' A+\beta' B+\gamma' C)^{2}=(\alpha\gamma B)(\alpha'\gamma' B)=0.$

Note that the ring of coefficients for $R$ is really immaterial. Even ${\mathbb{Z}}_{2}$ would suffice, furnishing us with an eight-element ring satisfying the condition of the problem.
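
The eight-element version is small enough to check exhaustively. The following is a minimal sketch in Python, assuming coefficients in $\mathbb{Z}_2$; the helper names `matmul`, `element` and `sq` are mine. It builds every $\alpha A+\beta B+\gamma C$ as an explicit matrix and tests closure, the condition $(xy)^{2}=x^{2}y^{2}$, and non-commutativity over all ordered pairs.

```python
from itertools import product

MOD = 2

def matmul(X, Y):
    # 3x3 matrix product with entries reduced mod MOD.
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(3)) % MOD for j in range(3))
        for i in range(3)
    )

def element(alpha, beta, gamma):
    # alpha*A + beta*B + gamma*C as a strictly upper-triangular matrix.
    return ((0, alpha, beta), (0, 0, gamma), (0, 0, 0))

def sq(X):
    return matmul(X, X)

R = [element(*t) for t in product(range(MOD), repeat=3)]
R_set = set(R)

closed = all(matmul(x, y) in R_set for x in R for y in R)
condition = all(sq(matmul(x, y)) == matmul(sq(x), sq(y)) for x in R for y in R)
noncommuting = [(x, y) for x in R for y in R if matmul(x, y) != matmul(y, x)]

print(len(R), closed, condition, len(noncommuting) > 0)
```

Changing `MOD` to another modulus gives a quick way to confirm that the choice of coefficient ring does not affect the conclusion.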

This exercise illustrates just how fragile the conditions in (a) are for ensuring that $R$ is commutative. If they are relaxed in any respect, $R$ no longer needs to be commutative.

**(a)** This is another follow-your-nose elementary manipulation. Based on the problem statement, we would like to end up with a result like $2(ab-ba)=0$. First note that
$[a(1+b)]^{2}=(a+ab)^{2}=a^{2}+a^{2}b+aba+(ab)^{2},$

but this is also equal to $[(1+b)a]^{2}=(a+ba)^{2}=a^{2}+aba+ba^{2}+(ba)^{2}.$

Comparing the two, we have ${a}^{2}b=b{a}^{2}$ for any $a,b\in R$. In other words, a square commutes with anything. Now, in particular, it is true that $(1+a{)}^{2}b=b(1+a{)}^{2}$. We write $0=(1+a{)}^{2}b-b(1+a{)}^{2}=b+2ab+{a}^{2}b-b-2ba-b{a}^{2}=2(ab-ba),$

again using the property derived above. Because $2x=0$ implies $x=0$, we have that $ab=ba$ for arbitrary $a,b\in R$ and thus $R$ is commutative.

**(b)** Here we seek to construct a non-commutative ring with $(ab{)}^{2}=(ba{)}^{2}$ for all $a,b\in R$. By part (a), we have the hint that this ring must contain some non-zero element $x$ with $2x=0$. This naturally suggests things like $\mathbb{Z}\mathrm{/}2\mathbb{Z}$ and $\mathbb{Z}\mathrm{/}4\mathbb{Z}$. In order to get non-commutativity, it’s then natural to look at matrices over those rings.

As in problem 3.23, I tried various rings of $2\times 2$ matrices over $\mathbb{Z}\mathrm{/}2\mathbb{Z}$ and $\mathbb{Z}\mathrm{/}4\mathbb{Z}$, but they always failed. One can easily write down explicit expressions for $2\times 2$ matrices $a,b$ with $(ab{)}^{2}=(ba{)}^{2}$, and the resulting ring always ends up commutative. Convinced of the futility of $2\times 2$ matrices, we look at $3\times 3$ matrices over $\mathbb{Z}\mathrm{/}2\mathbb{Z}$.

The solution of 3.23 does not work here because $R$ is required to be unital and the strictly upper-triangular matrices lack an identity element. However, if we include the diagonal, then we have it: recall the notation $A,B,C$ from the solution to 3.23, above, and consider the set $R=\{x1+\alpha A+\beta B+\gamma C\mid x,\alpha ,\beta ,\gamma \in \mathbb{Z}\mathrm{/}2\mathbb{Z}\}$

($1$ is the identity matrix). $R$ is the $16$ element set of all $3\times 3$ upper-triangular matrices over $\mathbb{Z}/2\mathbb{Z}$ whose three diagonal entries are equal. It is closed under addition and, by the product formula below, under multiplication, so it is a ring. It is unital ($x=1$, $\alpha=\beta=\gamma=0$) and non-commutative: $1+A+B+C=(1+A)(1+C)\ne(1+C)(1+A)=1+A+C.$

Crucially, it also satisfies the condition $(ab{)}^{2}=(ba{)}^{2}$ for all $a,b\in R$: letting $\mathbf{x}=(x,\alpha ,\beta ,\gamma )$ and $\mathbf{E}=(1,A,B,C)$, we have $(\mathbf{x}\cdot \mathbf{E})({\mathbf{x}}^{\mathrm{\prime}}\cdot \mathbf{E})=x{x}^{\mathrm{\prime}}+(x{\alpha}^{\mathrm{\prime}}+\alpha {x}^{\mathrm{\prime}})A+(x{\beta}^{\mathrm{\prime}}+\beta {x}^{\mathrm{\prime}}+\alpha {\gamma}^{\mathrm{\prime}})B+(x{\gamma}^{\mathrm{\prime}}+\gamma {x}^{\mathrm{\prime}})C,$

$({\mathbf{x}}^{\mathrm{\prime}}\cdot \mathbf{E})(\mathbf{x}\cdot \mathbf{E})=x{x}^{\mathrm{\prime}}+(x{\alpha}^{\mathrm{\prime}}+\alpha {x}^{\mathrm{\prime}})A+(x{\beta}^{\mathrm{\prime}}+\beta {x}^{\mathrm{\prime}}+{\alpha}^{\mathrm{\prime}}\gamma )B+(x{\gamma}^{\mathrm{\prime}}+\gamma {x}^{\mathrm{\prime}})C\mathrm{.}$

Squaring, and using $(pA+qB+rC)^{2}=prB$ for the non-scalar part, $[(\mathbf{x}\cdot\mathbf{E})(\mathbf{x}'\cdot\mathbf{E})]^{2}=2xx'(\mathbf{x}\cdot\mathbf{E})(\mathbf{x}'\cdot\mathbf{E})-(xx')^{2}+(x\alpha'+\alpha x')(x\gamma'+\gamma x')B,$

$[(\mathbf{x}'\cdot\mathbf{E})(\mathbf{x}\cdot\mathbf{E})]^{2}=2xx'(\mathbf{x}'\cdot\mathbf{E})(\mathbf{x}\cdot\mathbf{E})-(xx')^{2}+(x\alpha'+\alpha x')(x\gamma'+\gamma x')B.$

Because $2=0$, the first terms vanish and the result is proven.
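
The sixteen-element ring can also be verified by exhaustion. Below is a minimal sketch in Python; the function names `matmul`, `ring` and `squares_agree` are my own. As a cross-check against part (a), the same construction over $\mathbb{Z}/3\mathbb{Z}$ (my choice of comparison, where $2x=0$ does force $x=0$) is tested as well, and there the condition $(ab)^{2}=(ba)^{2}$ should fail because the ring is still non-commutative.

```python
from itertools import product

def matmul(X, Y, mod):
    # 3x3 matrix product with entries reduced mod `mod`.
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(3)) % mod for j in range(3))
        for i in range(3)
    )

def ring(mod):
    # x*1 + alpha*A + beta*B + gamma*C: upper triangular, constant diagonal.
    return [
        ((x, al, be), (0, x, ga), (0, 0, x))
        for x, al, be, ga in product(range(mod), repeat=4)
    ]

def squares_agree(mod):
    # True when (ab)^2 == (ba)^2 for every ordered pair in the ring.
    R = ring(mod)

    def sq(X):
        return matmul(X, X, mod)

    return all(
        sq(matmul(a, b, mod)) == sq(matmul(b, a, mod))
        for a in R
        for b in R
    )

R2 = ring(2)
R2_set = set(R2)
identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

closed = all(matmul(a, b, 2) in R2_set for a in R2 for b in R2)
unital = identity in R2_set
noncommutative = any(matmul(a, b, 2) != matmul(b, a, 2) for a in R2 for b in R2)

print(len(R2), closed, unital, noncommutative)
print(squares_agree(2), squares_agree(3))
```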

I was stuck on this problem until getting guidance from Jack Schmidt in