Automorphism Groups of Simple Graphs

September 26, 2009

Theorem. Every finite group is the automorphism group of a simple graph.

We first show that it is sufficient to construct a weighted directed graph having the given group as its automorphism group. A weighted directed graph is a triple (V,W,E), where V is a finite set of vertices, W is a finite set of weights and E:V\times V \rightarrow W is a partial map. An automorphism of (V,W,E) is a bijection \varphi:V\rightarrow V such that for all v,w\in V, E(\varphi(v),\varphi(w)) is defined if and only if E(v,w) is defined, in which case E(\varphi(v),\varphi(w))=E(v,w). The group of all automorphisms of (V,W,E) is denoted by \mathrm{Aut}(V,W,E). If v\in V, then \mathrm{deg} v := |\{ w\in V \mid (v,w)\in \mathrm{D}(E) \}| + |\{ w\in V \mid (w,v)\in \mathrm{D}(E) \}| denotes the (total) degree of v, where \mathrm{D}(E) is the domain of E.

Lemma. Let (V,W,E) be a weighted directed graph with \mathrm{deg} v \geq 2 for all v\in V. Then there exists a simple graph G with \mathrm{Aut}(G) = \mathrm{Aut}(V,W,E).
Proof. Wlog assume that W=\{1,2,\ldots,N\} for some N. Construct G from (V,W,E) by replacing every edge (v,w) with E(v,w)=k by an undirected gadget encoding both the direction of the edge and the weight k (for instance, a path from v to w whose interior vertices carry pendant paths of lengths determined by k, with different lengths near v and near w so that the direction can be recovered), and notice that every automorphism of (V,W,E) induces an automorphism of G in the canonical way, while it is easy to see that every automorphism of G must necessarily permute the original vertices and thus be equal to one of the induced automorphisms.


Let G be an arbitrary finite group with order at least three (the other cases are trivial). Construct a weighted directed graph by putting V=W=G and E(x,y) = x^{-1}y for all x,y\in G, and for g \in G, let \varphi_g: G\rightarrow G, x\mapsto gx. Note that \varphi_g is an automorphism of the graph for all g \in G since E(\varphi_g(x),\varphi_g(y)) = (gx)^{-1}gy = x^{-1}y = E(x,y) and that \varphi_g\circ \varphi_h = \varphi_{gh}.
Furthermore, if \varphi is an automorphism of the graph, let g:=\varphi(1) and x\in G be arbitrary. Then E(g, \varphi(x)) = E(\varphi(1),\varphi(x)) = E(1,x)=x, so \varphi(x) = gx, i.e. \varphi=\varphi_g. It follows that \mathrm{Aut}(V,W,E) = \{ \varphi_g \mid g\in G\} \simeq G and by the lemma above, there exists also a simple graph with automorphism group G.
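The construction can be checked by brute force for a small group. The following sketch (taking G = \mathbb{Z}_4 written additively, so that E(x,y)=x^{-1}y becomes y-x mod 4) verifies that the weight-preserving automorphisms are exactly the left translations:

```python
from itertools import permutations

# Brute-force check for G = Z_4 (addition mod 4):
# vertices = group elements, weight E(x, y) = y - x mod 4 ("x^{-1} y" additively).
n = 4
G = range(n)
E = {(x, y): (y - x) % n for x in G for y in G}

# Weight-preserving bijections of the vertex set:
autos = [p for p in permutations(G)
         if all(E[p[x], p[y]] == E[x, y] for x in G for y in G)]

# They should be exactly the left translations x -> g + x.
translations = [tuple((g + x) % n for x in G) for g in G]
assert sorted(autos) == sorted(translations)
```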


The Sum of Primitive Roots of Unity

June 9, 2009

Let \zeta_n=\exp(2i\pi/n) be the canonical primitive nth root of unity. The identity

\displaystyle \sum_{k=1}^{n} \zeta_n^k = \begin{cases} 1 &,\quad n=1\\ 0 &,\quad n\geq 2\end{cases}

is well known and easy to prove. One can now ask what happens if the sum is taken only over the primitive nth roots of unity. Let

\displaystyle s(n) := \sum_{\substack{1\leq k\leq n\\ \gcd(k,n)=1}} \zeta_n^k

be the sum of all primitive nth roots of unity. Since every nth root of unity is a primitive dth root of unity for some divisor d of n, we have, for all positive integers n,

\displaystyle \sum_{d|n} s(d) = \sum_{k=1}^n \zeta_n^k = \begin{cases} 1 &,\quad n=1\\ 0 &,\quad n\geq 2.\end{cases}

This property, however, characterises the Möbius function \mu (it says precisely that s is the Dirichlet inverse of the constant function 1), hence s(n)=\mu(n).
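A quick numerical check of s(n)=\mu(n); the \mu implementation below is a straightforward trial-division sketch:

```python
import cmath
from math import gcd

def s(n):
    """Sum of the primitive nth roots of unity, computed numerically."""
    return sum(cmath.exp(2j * cmath.pi * k / n)
               for k in range(1, n + 1) if gcd(k, n) == 1)

def mu(n):
    """Moebius function via trial factorisation."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0          # repeated prime factor
            result = -result      # one more distinct prime factor
        d += 1
    return -result if n > 1 else result

assert all(abs(s(n) - mu(n)) < 1e-9 for n in range(1, 31))
```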

The Probability of Coprimality

April 4, 2009

I recently found the following problem on a problem sheet, asking for the probability that two randomly chosen integers are coprime; as I have later been told, this is a classical result already discovered by Dirichlet.

Problem. Choose two natural numbers x, y \in \{1, 2,\ldots,n\} randomly (with uniform probability) and independently. Let P_n be the probability that x and y are coprime. Find \lim_{n\rightarrow \infty} P_n.

The result is 6/\pi^2=1/\zeta(2), but before proving this rigorously, I will present two heuristic approaches here.

Heuristic 1. Let p be a prime number. Then intuitively, the probability that two randomly chosen integers are not both divisible by p is 1-1/p^2. Thus, the probability that two randomly chosen integers are neither both divisible by p_1 nor by p_{2} etc is

\displaystyle \prod_{k=1}^\infty \left(1-\frac1{p_k^2}\right) + \text{ terms correcting dependency relations},

where p_1, p_2, \ldots is the sequence of prime numbers.

The product above is exactly 1/\zeta(2) by the Euler product formula, and it is natural to conjecture that the remaining terms are negligible (although a direct proof of this might be quite nasty).


Heuristic 2. Suppose that P is the probability in question. Then for a positive integer d, the probability that two randomly chosen positive integers have \gcd=d is P/d^2 since 1/d^2 is the probability that both numbers are divisible by d and P is the probability that the remaining factors are coprime. But the sum of all those probabilities must be equal to 1 (as any two randomly chosen positive integers have a \gcd), so

\displaystyle 1 = \sum_{d=1}^\infty \frac{P}{d^2} = P\cdot \zeta(2).

A general problem with these heuristic approaches is, among other things, that there is no uniform probability distribution on \mathbb{Z}^+. The second approach seems more amenable to being made rigorous; I will, however, give a proof using a completely different approach here.

Proof. First, notice that

\displaystyle P_n = \frac{1}{n^2}\left(2 \left(\sum_{k=1}^{n} \varphi(k)\right) -1\right),

where \varphi denotes Euler's totient function (for each y, there are \varphi(y) integers 1\leq x\leq y coprime to y, and doubling over the two orderings counts the pair (1,1) twice). Recall that

\displaystyle \varphi(k) = \sum_{d|k}\mu(d)\frac{k}{d},

where \mu denotes the Möbius function, so

\displaystyle P_n = -\frac{1}{n^2}+\frac{2}{n^2} \sum_{k=1}^n\sum_{d|k}\mu(d) \frac{k}{d}

\displaystyle = -\frac{1}{n^2}+ \frac{2}{n^2} \sum_{dk' \leq n} \mu(d)k'

\displaystyle = -\frac{1}{n^2}+\frac{2}{n^2} \sum_{d=1}^n \mu(d) \sum_{k'=1}^{\lfloor{n/d}\rfloor}k'

\displaystyle = -\frac{1}{n^2}+ \frac{1}{n^2} \sum_{d=1}^n \mu(d) \left\lfloor \frac nd \right\rfloor \left(\left\lfloor\frac nd \right\rfloor+1\right)

\displaystyle = -\frac{1}{n^2}+ \frac{1}{n^2} \sum_{d=1}^n \mu(d)\left( \frac{n^2}{d^2} + \mathcal O\left(\frac nd\right) \right)

\displaystyle = \left(\sum_{d=1}^n\mu(d)\frac{1}{d^2}\right) + \mathcal O\left(\frac1n\sum_{d=1}^n \frac1d\right) -\frac{1}{n^2}

\displaystyle = \sum_{d=1}^n\mu(d)\frac{1}{d^2} +\mathcal O\left(\frac{\log n}{n}\right) -\frac{1}{n^2}.

Recall that for all real numbers r>1,

\displaystyle \sum_{k=1}^\infty \mu(k)\frac{1}{k^r} = \frac{1}{\zeta(r)}.


Hence,

\displaystyle \lim_{n\rightarrow \infty} P_n = \frac{1}{\zeta(2)} = \frac{6}{\pi^2},

as required.
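The convergence, with roughly the \mathcal O(\log n / n) error term derived above, can also be observed numerically; a small sketch:

```python
from math import gcd, pi

def P(n):
    """Exact probability that two independent uniform draws from {1,...,n} are coprime."""
    coprime = sum(1 for x in range(1, n + 1)
                  for y in range(1, n + 1) if gcd(x, y) == 1)
    return coprime / n**2

# P(n) -> 1/zeta(2) = 6/pi^2 = 0.6079...
assert abs(P(500) - 6 / pi**2) < 0.02
```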


Some More Thoughts on r-Ary Functions over Finite Fields

March 20, 2009

In this note, I proved that every r-ary function on a finite field K with n elements can be represented by a unique polynomial P\in K[X_1,\ldots,X_r] with \deg_{X_i}(P) \leq n-1 for all i\in\{1,\ldots,r\}. A more direct way to see this is, given a function f:K^r\rightarrow K, to consider the polynomial

\displaystyle P(X_1,\ldots,X_r) = \sum_{a\in K^r} f(a)\prod_{i=1}^r \left(1- (X_i-a_i)^{n-1}\right).

Recall that for x_i\in K,

\displaystyle  (x_i-a_i)^{n-1} =	\begin{cases}	0&\text{if }x_i=a_i\\		1&\text{otherwise},	\end{cases}

so P(a)=f(a) for all a\in K^r.
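For a prime field K = GF(p) (an assumption made for this sketch; the argument works over any finite field), the interpolation via the product of coordinate indicators can be checked directly:

```python
from itertools import product

p, r = 3, 2                      # K = GF(3), binary functions (prime field assumed)
K = range(p)

def f(a):                        # an arbitrary test function K^r -> K
    return (a[0] * a[1] + 1) % p

def P(x):
    """Evaluate sum_a f(a) * prod_i (1 - (x_i - a_i)^(p-1)) at the point x."""
    total = 0
    for a in product(K, repeat=r):
        indicator = 1            # 1 iff x_i = a_i for all i, since t^(p-1) mod p is 0 or 1
        for xi, ai in zip(x, a):
            indicator *= 1 - pow(xi - ai, p - 1, p)
        total += f(a) * indicator
    return total % p

assert all(P(x) == f(x) for x in product(K, repeat=r))
```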

Denote by P_f the polynomial defined above. We know that \deg(P_f)\leq r(n-1), but can we find the exact value? For this purpose, consider the following very useful lemma.

Lemma. Let R be an integral domain and H be a finite multiplicative subgroup of R^* with |H|=d. Then for every integer s,

\displaystyle  \sum_{x\in H} x^s = \begin{cases} d &\text{if } d|s\\	0 &\text{otherwise}. \end{cases}

Proof. It is immediate from Lagrange’s Theorem that \sum_{x\in H}x^s = d if d|s. Suppose that d\nmid s. Notice that H must necessarily be cyclic (as e.g. can easily be proved using the structure theorem of finitely generated Abelian groups), so in particular, there exists a \zeta\in H with \zeta^s \neq 1. Multiplication by \zeta permutes the elements of H, so

\displaystyle 0 = \sum_{x\in H} (\zeta x)^s - \sum_{x\in H} x^s = (\zeta^s-1)\sum_{x\in H} x^s.

But \zeta^s\neq 1 and R is an integral domain, so the result follows.


Corollary. If K is a finite field with n elements and s is an integer, then

\displaystyle\sum_{x\in K} x^s = \begin{cases} n-1 &\text{if } (n-1)|s \text{ and } s\neq 0\\ 0 &\text{otherwise}.\end{cases}

Suppose now that

\displaystyle  P_f(X_1,\ldots,X_r) = \sum_{k\in S^r} a_k X_1^{k_1}\ldots X_r^{k_r},

where S=\{0,\ldots,n-1\}. For integers l_1,\ldots,l_r\in S, consider now the sum

\displaystyle \sum_{x\in K^r} x_1^{l_1}\ldots x_r^{l_r}P_f(x)	= \sum_{x\in K^r} \sum_{k\in S^r} a_k \prod_{i=1}^r x_i^{k_i+l_i}
\displaystyle = \sum_{k\in S^r} a_k \prod_{i=1}^r \sum_{x_i\in K} x_i^{k_i+l_i}.

By the corollary above,

\displaystyle  \sum_{x_i\in K} x_i^{k_i+l_i} = \begin{cases}n-1 &\text{if } (n-1)|(k_i+l_i) \text{ and } k_i+l_i>0 \\ 0 &\text{otherwise}.	\end{cases}

We thus obtain

Theorem. Suppose that the coefficient of x_1^{t_1}\ldots x_r^{t_r} in P_f is nonzero. Then \deg(P_f)=t_1+\ldots+t_r if and only if

\displaystyle \sum_{x\in K^r} x_1^{l_1}\ldots x_r^{l_r} P_f(x) = 0

for all (l_1,\ldots,l_r)\in S^r with l_i < n-1-t_i for all i\in\{1,\ldots,r\}, and

\displaystyle  \sum_{x\in K^r} x_1^{n-1-t_1}\ldots x_r^{n-1-t_r} P_f(x) \neq 0,

where the latter sum must then be equal to a_{(t_1,\ldots,t_r)}(n-1)^r.

Hamiltonian Path Exchanges

March 7, 2009

The following nice little trick was shown to me by Stephan Wagner some time ago.

Theorem. Let G=(V,E) be a graph and let \Psi=\{v\in V \mid \deg_G v \equiv 0 \pmod2 \} be the set of all vertices of even degree. Then for each v\in V, the number of Hamiltonian paths from v to some vertex in \Psi is even.

Proof. For a Hamiltonian path p=v_1\ldots v_n and a vertex v_j\in V with v_jv_n\in E\backslash E(p), let S(p, v_j) be the Hamiltonian path v_1\ldots v_j v_n v_{n-1}\ldots v_{j+1}. We call the transformation p\mapsto S(p,v_j) the path exchange of p wrt v_j. Clearly, if p'=S(p,v_j), then conversely p=S(p',v_j).

For a fixed v\in V, let P(v) be the set of all Hamiltonian paths starting in v. Draw a graph G' (called the v-path graph of G) with vertex set P(v), where two Hamiltonian paths p,p'\in P(v) are connected in G' iff p'=S(p,v_j) for some vertex v_j with v_jv_n\in E\backslash E(p), v_n being the end vertex of p. For each p\in P(v), the degree of p in G' is \deg_G(v')-1, where v' is the end vertex of p other than v. In particular, \deg_G(v') and \deg_{G'}(p) are of different parity. But in every graph, the number of vertices of odd degree is even. This proves the claim.
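The theorem can be verified by brute force on a small example; the graph below (a 5-cycle plus one chord, my own test graph) has even-degree vertices 1, 3 and 4:

```python
from itertools import permutations

# A 5-cycle 0-1-2-3-4-0 plus the chord 0-2.
V = range(5)
E = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)}
adj = {v: set() for v in V}
for a, b in E:
    adj[a].add(b)
    adj[b].add(a)

even = {v for v in V if len(adj[v]) % 2 == 0}  # vertices of even degree

def ham_paths_to_even(v):
    """Number of Hamiltonian paths starting in v and ending in a vertex of even degree."""
    return sum(1 for p in permutations(V)
               if p[0] == v and p[-1] in even
               and all(p[i + 1] in adj[p[i]] for i in range(len(V) - 1)))

# The theorem predicts this count is even for every start vertex.
assert all(ham_paths_to_even(v) % 2 == 0 for v in V)
```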


Corollary. Let G=(V,E) be a graph with \deg_G(v) being odd for all v\in V. Then for each e\in E, the number of Hamiltonian cycles passing through e is even. In particular, if in addition G is Hamiltonian, then it has at least two distinct Hamiltonian cycles.

Remark. The special case of the above corollary for cubic graphs is known as Smith’s Theorem. However, the problem of finding a second Hamiltonian cycle (in polynomial time) in a cubic graph when given already one is still an open question in complexity theory.

Some Remarks on K^r-Annihilating Polynomials over Finite Fields

February 11, 2009

We consider a finite field K with n elements and the polynomial ring R=K[X_1,\ldots,X_r]. Let’s investigate what polynomials P\in R with P(x)=0 for all x\in K^r (i.e. polynomials annihilating K^r) look like. First, recall the following theorem (which I, as a former participant of the IMO in 2007, should probably be familiar with):

Theorem (Combinatorial Nullstellensatz). Let K be an arbitrary field and P\in K[X_1,\ldots,X_r] be a polynomial with \deg P = t_1+\cdots+t_r, where t_1,\ldots,t_r are nonnegative integers such that the coefficient of X_1^{t_1}\ldots X_r^{t_r} in P is nonzero. Suppose that S_1,\ldots, S_r are subsets of K with |S_i|>t_i for all i=1,\ldots,r. Then there exist s_1\in S_1,\ldots,s_r\in S_r so that P(s_1,\ldots,s_r)\neq 0.

This theorem immediately implies that if P\in R is a nonconstant polynomial annihilating K^r, then \deg_{X_i}P\geq n for some i (there are of course other nice proofs of that result but I somehow like it how CNS trivialises it). Furthermore, we can infer from this that the polynomials in R, considered as functions K^r\rightarrow K, yield exactly n^{n^r} distinct functions. Indeed, there are at most n^{n^r} polynomial functions since there are exactly n^{n^r} functions K^r \rightarrow K altogether, and conversely, two distinct polynomials P,Q\in R with \deg_{X_i}P, \deg_{X_i}Q < n for all i=1,\ldots,r (of which there are exactly n^{n^r}) cannot be equal on the whole of K^r by the above. We therefore immediately see that every function K^r\rightarrow K is equal to a polynomial function.

Also, an immediate induction on the degree of the polynomials shows that the ideal I=\{P\in R \mid P(x)=0\text{ for all } x\in K^r\} of polynomials annihilating K^r is generated by the polynomials \left(X_i^n-X_i\right)_{i=1,\ldots,r}.
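The one-variable reduction underlying this, namely x^n = x for all x\in K, can be sketched for K = GF(3): reducing each exponent modulo the relation leaves the induced function unchanged.

```python
# Over K = GF(p), x^p = x as functions, so exponents reduce via
# e -> ((e - 1) % (p - 1)) + 1 for e >= p without changing the induced function.
p = 3

def reduce_exp(e):
    """Reduce an exponent using the relation x^p = x."""
    return e if e < p else ((e - 1) % (p - 1)) + 1

assert all(pow(x, e, p) == pow(x, reduce_exp(e), p)
           for x in range(p) for e in range(1, 20))
```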

All Groups of Order n are cyclic iff…

January 22, 2009

Many months ago, I found a very interesting group theory problem (the if-part of the Proposition below) in Lee Zhuo Zhao‘s \LaTeX-signature on Facebook. My first pathetic attempts to solve that problem failed, apparently it required deeper knowledge of group theory than I had at that time. But even after reading some introductory lecture notes on that topic, it took me ages until I succeeded in solving this problem. It wasn’t too difficult then to see that the conclusion is also reversible.

I will prove the following result:

Proposition. Let n be a positive integer. Then every finite group of order n is cyclic if and only if

\displaystyle \gcd(n,\varphi(n))=1,

where \varphi denotes Euler’s totient function.
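The characterisation is easy to explore numerically; a small sketch listing the n < 50 for which, by the proposition, every group of order n is cyclic:

```python
from math import gcd

def phi(n):
    """Euler's totient function via trial division."""
    result, d, m = n, 2, n
    while d * d <= m:
        if m % d == 0:
            while m % d == 0:
                m //= d
            result -= result // d
        d += 1
    if m > 1:
        result -= result // m
    return result

# Orders n for which every group of order n is cyclic:
cyclic_orders = [n for n in range(1, 50) if gcd(n, phi(n)) == 1]
# contains all primes, but also composites such as 15, 33 and 35
```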

I will use the following notation: If G is a group and x\in G, then C_G(x) denotes the centraliser of x in G and \langle x\rangle denotes the cyclic subgroup generated by x.

Proof of the if-part. Let n be a positive integer satisfying \gcd(n,\varphi(n))=1.
Note that using the canonical formula for \varphi(n), it immediately follows that n is squarefree.

We will use induction on n. The claim is trivially true if n is a prime number. Suppose now that for all positive integers n'<n with \gcd(n', \varphi(n'))=1, all groups of order n' are cyclic.

Let G be a group of order n. We have to prove that G is cyclic. From the induction hypothesis and Lagrange’s theorem, it follows that all proper subgroups of G are cyclic.

Lemma 1. Any two different elements of a cyclic subgroup of G are not conjugate.
Proof. Let \langle \zeta \rangle be a cyclic subgroup of G. Suppose that g\zeta^k g^{-1}= \zeta^l for integers k, l and some g \in G. We claim that \zeta^k=\zeta^l.
An immediate induction on d shows that

\displaystyle  \zeta^{l^d} = g^d\zeta^{k^d}g^{-d}

holds for all positive integers d. Putting d=\mathrm{ord}_G(g), we see that

\displaystyle \zeta^{k^d-l^d}=1,

so m := \mathrm{ord}_G(\zeta) divides k^d-l^d. Clearly, m and d divide n. Furthermore, m is squarefree since n is.

Let p be a prime divisor of m. Then, since p divides m and m divides k^d-l^d, we have p|k if and only if p|l. If p|k and p|l, then clearly p|(k-l).
Otherwise, if k and l are not divisible by p, then l has a multiplicative inverse l^{-1} modulo p, so \left(l^{-1}k\right)^d\equiv 1\pmod p. Let d'=\mathrm{ord}_p\left( l^{-1}k\right). Then d'|d, so d'|n. Also, d'|(p-1) since \left(l^{-1}k\right)^{p-1}\equiv 1\pmod p by Fermat’s little theorem. But (p-1)|\varphi(n), so d'|\varphi(n). It follows that d' is a common divisor of n and \varphi(n), so d'=1. Hence, l^{-1}k\equiv 1\pmod p, so p|(k-l).

Thus, we see that p|(k-l) for all prime divisors p of m and since m is squarefree, it follows that m|(k-l). But then \zeta^k=\zeta^l, as required.


Next, we prove that the center of G is not trivial. To this end, assume for the sake of contradiction that the center Z of G is trivial.

Then for each x\in G with x\neq 1, C_G(x) is a proper subgroup of G. Furthermore, this subgroup is maximal, for if \langle \zeta \rangle is a maximal proper subgroup containing x (we can assume it to be cyclic by the induction hypothesis), then \langle \zeta \rangle \leq C_G(\zeta) \leq C_G(x)\lneqq G, so \langle \zeta \rangle = C_G(\zeta) = C_G(x).

It follows that

Lemma 2. For each x\in G with x\neq 1, C_G(x) is the unique maximal proper subgroup of G containing x.

It is well known that if \langle \zeta\rangle is a cyclic group of order k and d is a divisor of k, then \langle\zeta\rangle contains a unique subgroup of order d, namely \langle \zeta^{k/d}\rangle.

Lemma 3. Let \langle \zeta\rangle and \langle \eta\rangle be two maximal proper subgroups of G and let d_1 = \mathrm{ord}_G(\zeta) and d_2=\mathrm{ord}_G(\eta). Suppose that \gcd(d_1,d_2)>1. Then \langle \zeta \rangle and \langle\eta\rangle are conjugate. In particular, d_1=d_2.
Proof. Let p be a common prime divisor of d_1 and d_2. Then \langle \zeta\rangle and \langle\eta\rangle both contain a unique subgroup \langle x\rangle \leq \langle \zeta\rangle and \langle y\rangle \leq \langle\eta\rangle of order p respectively. It follows from Lemma 2 that C_G(x)=\langle \zeta\rangle and C_G(y)=\langle \eta\rangle.

Furthermore, it follows from Sylow’s second theorem that \langle x\rangle and \langle y\rangle are conjugate, so \langle y\rangle= g\langle x\rangle g^{-1} for some g \in G. Thus, gxg^{-1} is a generator of \langle y\rangle, wlog assume that gxg^{-1}=y. Now,

\displaystyle h\in C_G(y) \Leftrightarrow hyh^{-1} = y \Leftrightarrow hgxg^{-1}h^{-1} = gxg^{-1} \Leftrightarrow g^{-1}hgxg^{-1}h^{-1}g = x \Leftrightarrow g^{-1}hg \in C_G(x) \Leftrightarrow h \in gC_G(x)g^{-1}.

It follows that C_G(y) = gC_G(x)g^{-1}, so \langle \eta\rangle = g\langle \zeta\rangle g^{-1}.


Now, since the center of G is trivial, it follows from the class equation that

\displaystyle  n = 1 + \sum_{i=1}^r \frac{n}{|C_G(x_i)|},      (1)

where x_1,\ldots,x_r are representatives of the different nontrivial conjugacy classes. Each C_G(x_i) is a maximal proper subgroup, the orders of any two of them being either equal or coprime (Lemma 3). Furthermore, it is clear that for every prime divisor p of n, there exists an x_i so that p divides |C_G(x_i)|. Hence, we can write n as n=n_1\ldots n_k, where each n_j is the order of some of the C_G(x_i). Notice that k>1 and n_j > 1 for all j=1,\ldots,k, and that every summand in (1) is of the form n/n_j for some j.

Furthermore, for each x_i, the elements of C_G(x_i) belong to pairwise different conjugacy classes by Lemma 1. Moreover, for each x\in C_G(x_i) with x\neq 1, we have C_G(x)=C_G(x_i) since C_G(x_i) is a maximal proper subgroup of G containing x, which is unique by Lemma 2. It follows that in the sum in (1), the summand n/n_j appears at least n_j-1 times for all j=1,\ldots,k. Hence,

\displaystyle n \geq 1 + \sum_{j=1}^k \frac{n}{n_j}(n_j-1),


so

\displaystyle n\left(1+\sum_{j=1}^k \frac{1}{n_j}\right) \geq 1+k n,


and hence

\displaystyle 1+\sum_{j=1}^k \frac{1}{n_j} > k

and since n_j\geq 2 for all j=1,\ldots,k,

\displaystyle 1+ \frac{k}{2} > k,

so k < 2, which is the desired contradiction.

It thus follows that the center Z of G is not trivial. Let d_1=|Z| and d_2=|G/Z|. Then d_1d_2=n. If d_1=n, then G=Z is abelian and hence, n being squarefree, a direct product of cyclic groups of pairwise coprime prime orders, which is cyclic. We may therefore assume that 1<d_1,d_2<n, so by the induction hypothesis (note that \gcd(d,\varphi(d))=1 for every divisor d of n), both Z and G/Z are cyclic. Let \zeta be a generator of Z and gZ be a generator of G/Z. Let d=\mathrm{ord}_G(g). Then (gZ)^d=g^dZ=Z, so d_2|d. Let x=\zeta^d. Then \mathrm{ord}_G(x) = d_1/\gcd(d_1, d) = n/d.

Assume that \mathrm{ord}_G(gx) < n. Then \mathrm{ord}_G(gx) divides n/p for some prime divisor p of n. Since x\in Z,

\displaystyle  (gx)^{n/p} = g^{n/p}x^{n/p}.

But \mathrm{ord}_G(x)\cdot\mathrm{ord}_G(g)=n and n is squarefree, so n/p is divisible by exactly one of the numbers \mathrm{ord}_G(x) and \mathrm{ord}_G(g). But then, either

\displaystyle  (gx)^{n/p} = g^{n/p}\neq 1 \quad\text{ or }\quad (gx)^{n/p} = x^{n/p}\neq 1,

which is a contradiction. Hence \mathrm{ord}_G(gx)=n, so G is cyclic, as required.


Proof of the only if-part. Let n be a positive integer satisfying \gcd(n,\varphi(n))>1. We have to show that there exists a group G of order n that is not cyclic. The result is obvious if n is not squarefree, because if p is a prime with p^2|n, then the group \mathbb{Z}_p \times \mathbb{Z}_{n/p} is not cyclic.

Suppose now that n is squarefree and \gcd(n, \varphi(n))>1. Then there exist prime divisors p and q of n so that q-1 is divisible by p.

Let g be a primitive root modulo q. We consider the set

\displaystyle  H=\left\{ \sigma: \mathbb{Z}_q\rightarrow \mathbb{Z}_q, x\mapsto g^{\frac{q-1}{p}k}x+\lambda \mid k,\lambda\in\mathbb{Z} \right\}.

Clearly, |H|=pq since g^{\frac{q-1}{p}k_1}x+\lambda_1 \equiv g^{\frac{q-1}{p}k_2}x+\lambda_2 \pmod q for all x\in \mathbb{Z}_q if and only if k_1\equiv k_2\pmod p and \lambda_1\equiv \lambda_2\pmod q.

Furthermore, H is a subgroup of the symmetric group S(\mathbb{Z}_q) (the subgroup conditions can readily be verified). Let \sigma_1\in H send x to x+1 and \sigma_2\in H send x to g^{(q-1)/p}x. Then \sigma_1\circ\sigma_2 sends x to g^{(q-1)/p}x + 1 and \sigma_2\circ\sigma_1 sends x to g^{(q-1)/p}x + g^{(q-1)/p}, so H is not abelian.

Consider now the group G= H \times \mathbb{Z}_{n/(pq)}. Since |H|=pq, |G|=n. But H is not abelian, so G is not abelian. In particular, G is not cyclic. This completes the proof.
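The construction can be made concrete for p=3, q=7 (note 3 divides 7-1=6). The following sketch models the affine maps x\mapsto ax+b as pairs (a,b) and checks that H has order pq and is not abelian:

```python
# Realise H for p = 3, q = 7 as affine maps x -> a*x + b mod 7,
# with a in the order-3 subgroup of Z_7*.
p, q, g = 3, 7, 3                 # 3 is a primitive root mod 7
a0 = pow(g, (q - 1) // p, q)      # generator of the subgroup of order p in Z_7*
A = {pow(a0, k, q) for k in range(p)}

H = {(a, b) for a in A for b in range(q)}   # (a, b) represents x -> a*x + b

def compose(s, t):
    """(s o t)(x) = s(t(x)): coefficients (s_a*t_a, s_a*t_b + s_b) mod q."""
    return ((s[0] * t[0]) % q, (s[0] * t[1] + s[1]) % q)

assert len(H) == p * q
assert any(compose(s, t) != compose(t, s) for s in H for t in H)  # non-abelian
```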


Remark. For p=2, the group H constructed in the proof of the “only if”-part above is isomorphic to the dihedral group D_q of a regular q-gon.

Further Remark. I have just noticed that the group H is actually isomorphic to the semidirect product \mathbb{Z}_q \rtimes_{\theta} \mathbb{Z}_p wrt the homomorphism

\displaystyle \theta: \mathbb{Z}_p \rightarrow \mathrm{Aut}(\mathbb{Z}_q), \quad k \mapsto \left(x\mapsto g^{\frac{q-1}{p}k}x\right).

Bezout’s Lemma in Endomorphism Rings of Vector Spaces

January 19, 2009

The well known Bezout’s Lemma states that for all integers a,b,d, there exist integers x,y such that xa+yb=d if and only if \gcd(a,b)|d. It is well known that this result also holds in principal ideal rings.

Some time ago, I found the following problem on a problem sheet.

Problem. Let A and B be n\times n matrices over \mathbb C. Prove that there exist complex n\times n matrices X,Y such that

\displaystyle  XA+YB=I_n

if and only if there does not exist a vector v\in\mathbb{C}^n\backslash \{0\} with Av=Bv=0, where I_n is the n\times n identity matrix.
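A numerical sanity check of the statement (assuming numpy; the stacking trick below is an illustration of one direction, not the problem's intended solution): if \ker A\cap\ker B=\{0\}, the 2n\times n stack of A and B has full column rank, and any left inverse of the stack, split into two blocks, yields X and Y.

```python
import numpy as np

# ker(A) = span(e3, e4), ker(B) = span(e1, e2), so ker(A) ∩ ker(B) = {0}.
n = 4
A = np.array([[1., 2., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 0., 0.],
              [0., 0., 0., 0.]])
B = np.array([[0., 0., 0., 0.],
              [0., 0., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 5., 1.]])
S = np.vstack([A, B])                    # 2n x n stack
assert np.linalg.matrix_rank(S) == n     # full column rank = trivial common kernel
M = np.linalg.pinv(S)                    # a left inverse of S (n x 2n)
X, Y = M[:, :n], M[:, n:]
assert np.allclose(X @ A + Y @ B, np.eye(n))   # [X | Y] @ [A; B] = XA + YB = I
```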

One cannot help noticing the apparent relation of this problem to Bezout’s Lemma, which, however, obviously cannot be applied directly as \mathbb{C}^{n\times n} is not commutative (and thus not a principal ideal ring). But we will see that commutativity is actually not required for Bezout. In fact, it is sufficient that every finitely generated left ideal is principal (or rather every right ideal, if the linear combinations in question are supposed to be from the right).

In the course of thinking about the problem above, I discovered that in ({\bf L}(V,V),+,\circ), every finitely generated left ideal is principal and that the generating elements are actually “easy” to describe. Here, V is a vector space over an arbitrary field K and {\bf L}(V,V) denotes the set of all V-endomorphisms.

I must at this point admit that my knowledge in linear algebra is actually quite moderate (I’m not at university yet). If someone sees things in a broader light and recognises these results as well known or as special cases/corollaries of more general results (I’m quite sure they are), I’d be glad about feedback.

Theorem. Let V be a (not necessarily finite-dimensional) vector space over a field K and let

\displaystyle I=\left\{\sum_{i=1}^k x_i\circ g_i \mid x_i\in {\bf L}(V,V)\right\}

be a left ideal in ({\bf L}(V,V),+,\circ), generated by g_1,\ldots,g_k\in {\bf L}(V,V). Then I is principal and a generating element of I can be described as follows. Define

\displaystyle  \ker( I) := \bigcap_{i=1}^k \ker(g_i).

Then any g\in {\bf L}(V,V) with \ker(g)=\ker(I) (e.g. a projection onto a complement of \ker(I) in V along \ker(I)) generates I.

Proof. First, it is easy to see, using induction, that it is sufficient to prove the case k=2.

Let g\in {\bf L}(V,V) be any V-endomorphism with \ker(g)=\ker(I)=\ker(g_1)\cap \ker(g_2). Let (b_i)_{i\in I} be a base of \ker(g) which extends to a base (b_i)_{i\in I\cup I_1} of \ker(g_1) and a base (b_i)_{i\in I\cup I_2} of \ker(g_2). Let U_1 = \langle (b_i)_{i\in I_1}\rangle and U_2 = \langle (b_i)_{i\in I_2}\rangle. Clearly, U_1\cap U_2 = \{0\} and (b_i)_{i\in I\cup I_1\cup I_2} extends to a base (b_i)_{i\in I\cup I_1\cup I_2 \cup J} of V=\ker(g)\oplus U_1\oplus U_2 \oplus U, where U=\langle (b_i)_{i\in J}\rangle. Then (g_1(b_i))_{i\in I_2\cup J} is a base of g_1(V) and (g_2(b_i))_{i\in I_1} is a family of linearly independent vectors. Let C_1 and C_2 be complements of g_1(V) and \langle (g_2(b_i))_{i\in I_1}\rangle in V respectively, where we choose C_2 to contain g_2(U) (this is possible since g_2 is injective on U_1\oplus U, so g_2(U)\cap \langle (g_2(b_i))_{i\in I_1}\rangle = \{0\}).

We can now define the V-endomorphisms f_1 by f_1(g_1(b_i))=g(b_i) for i\in I_2\cup J and f_1(x)=0 for all x\in C_1, and f_2 by f_2(g_2(b_i))=g(b_i) for all i\in I_1 and f_2(x)=0 for all x\in C_2. Note that f_1 and f_2 are well defined. Now, let u_1\in U_1, u_2\in U_2, u\in U. Then

\displaystyle f_1(g_1(u_1+u_2+u)) + f_2(g_2(u_1+u_2+u)) = g(u_2) + g(u) + g(u_1) = g(u_1+u_2+u),

so f_1\circ g_1 + f_2 \circ g_2 = g. It follows that g\in I.

In order to prove that g indeed generates I, it is sufficient to prove the following result: if f,g\in {\bf L}(V,V) and \ker(g) \leq \ker(f), then there exists a linear function h\in {\bf L}(V,V) such that f=h\circ g.

Let (b_i)_{i\in I_1} be a base of \ker(g). This extends to a base (b_i)_{i\in I_1\cup I_2} of \ker(f) and to a base (b_i)_{i\in I_1\cup I_2 \cup I_3} of V. Then (g(b_i))_{i\in I_2\cup I_3} is a base of g(V), which has a complement C in V. But then we can simply define a linear function h\in {\bf L}(V,V) by h(g(b_i)) = f(b_i) for i\in I_2\cup I_3 and h(x)=0 for x\in C. Note that h is well defined. But then, h(g(b_i))=f(b_i) for all i\in I_1\cup I_2\cup I_3 since h(g(b_i))=h(0)=0=f(b_i) for all i\in I_1. Thus, f=h\circ g.
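In matrix terms (finite-dimensional, over the reals for this sketch), the factorisation f = h\circ g when \ker(g)\leq\ker(f) can be produced with a pseudo-inverse:

```python
import numpy as np

# If ker(G) ⊆ ker(F), then H = F @ pinv(G) satisfies H @ G = F, because
# I - pinv(G) @ G projects onto ker(G), which F annihilates.
G = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 0.]])   # ker(G) = span(e3)
F = np.array([[2., 3., 0.],
              [0., 1., 0.],
              [1., 4., 0.]])   # third column zero, so ker(F) ⊇ ker(G)
H = F @ np.linalg.pinv(G)
assert np.allclose(H @ G, F)   # f = h ∘ g
```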


Remark. In the case when V is finite-dimensional over K, we can prove similar results for right ideals by using the transposes of linear maps. Moreover, it is easy to see that if V is finite-dimensional, then every left ideal of {\bf L}(V,V) is principal and generated by a V-endomorphism g which minimises \dim_K(\ker(g)). Indeed, if I is a left ideal of {\bf L}(V,V), g is such a V-endomorphism which minimises \dim_K(\ker(g)) and f is another arbitrary V-endomorphism, then the subideal of I generated by g and f is finitely generated, and thus generated by a single V-endomorphism g' satisfying \ker(g') = \ker(g)\cap \ker(f). But since \dim_K(\ker(g)) is minimal, it follows that \ker(g)=\ker(g'), implying \ker(g)\leq \ker(f). Hence, there exists a V-endomorphism h such that f=h\circ g, so g generates I.

We therefore see that if \dim_K(V) < \infty, then {\bf L}(V,V) is, in some sense, a non-commutative principal ideal ring.

Corollary. Let V be a vector space over K and let g_1,\ldots,g_k\in {\bf L}(V,V). Then there exist x_1,\ldots,x_k\in {\bf L}(V,V) with

\displaystyle  \sum_{i=1}^k x_i\circ g_i={\bf id}_V

if and only if \ker(g_1)\cap \cdots \cap\ker(g_k) = \{0\}.

Furthermore, if V is finite-dimensional, then there exist x_1,\ldots,x_k\in {\bf L}(V,V) such that

\displaystyle  \sum_{i=1}^k g_i\circ x_i = {\bf id}_V

if and only if \ker(g_1^\top)\cap \cdots \cap \ker(g_k^\top) = \{0\}, where g^\top denotes the transpose of g.

Algebraic Closures

January 18, 2009

Note: This entry contains some serious errors which I was unaware of when I wrote it in the first place. Until I find out how to resolve them, I leave it to the eager reader to find them 😉

Some days ago, I was told that every field has an algebraic closure which is unique up to isomorphism (apparently, this seems to be a well known result). However, I wasn’t able to prove these results without using Zorn’s Lemma (neither the existence nor the uniqueness).

Definition. Let K be a field. An Algebraic Closure of K is a field L so that L:K is an algebraic field extension and L is algebraically closed, i.e. every polynomial with coefficients in L splits into linear factors in L.

We first prove with Zorn’s Lemma that every field has an algebraic closure. Note that the union of the fields in a chain (ordered by set inclusion) is itself a field.

Theorem. Every field K has an algebraic closure.
Proof. Let P be the set of all fields L so that L:K is an algebraic field extension. Then P is partially ordered by set inclusion.

Let T be a chain in P. Consider S=\bigcup_{L\in T} L. Then clearly, S is a field containing K.

Moreover, suppose that \alpha\in S. Then there exists some L\in T so that \alpha\in L. Hence, \alpha is algebraic over K as L is an algebraic extension of K. We therefore see that S:K is an algebraic extension.

It follows that every chain in P has an upper bound in P, so P has a maximal element M by Zorn’s Lemma. We shall prove that M is the required algebraic closure.

Suppose then that f\in M[X] is nonconstant and irreducible. Let \alpha be a root of f in some splitting field and let L=M(\alpha)\simeq M/(f). Then L:M is a finite and hence algebraic field extension. But M:K is also an algebraic field extension, so L:K is algebraic, contradicting the maximality of M.


Next, we shall prove that the algebraic closure is unique up to isomorphism. We will use the following (well-known) Lemma.

Lemma. Let \sigma: K_1\rightarrow K_2 be an isomorphism between fields. Let f_1 be a polynomial with coefficients in K_1 and f_2 be the polynomial obtained by applying \sigma to the coefficients of f_1. Let L_1 and L_2 be splitting fields of f_1 and f_2 over K_1 and K_2 respectively. Then there is an isomorphism \tau:L_1\rightarrow L_2 extending \sigma.

Theorem. Let L_1 and L_2 be algebraic closures of a field K. Then L_1\simeq L_2.
Proof. Consider the set P of isomorphisms \sigma: M_1\rightarrow M_2, where K\leq M_1\leq L_1 and K\leq M_2\leq L_2. Define a partial order \sqsubseteq on P where \sigma_1\sqsubseteq\sigma_2 iff \sigma_2 extends \sigma_1. Denote by \mathrm{D}(\sigma) and \mathrm{Im}(\sigma) the domain and image of \sigma respectively.

Suppose that T is a chain in P. Define a function \psi: \bigcup_{\sigma\in T} \mathrm{D}(\sigma) \rightarrow \bigcup_{\sigma\in T} \mathrm{Im}(\sigma), x\mapsto \sigma(x) for some \sigma\in T with x\in\mathrm{D}(\sigma).

Note that \psi is well-defined, since T is a chain wrt \sqsubseteq. Furthermore, \bigcup_{\sigma\in T}\mathrm{D}(\sigma) and \bigcup_{\sigma\in T}\mathrm{Im}(\sigma) are both fields as they are unions of chains of fields (wrt set inclusion), so \psi is a homomorphism between fields. Since \psi is non-trivial, it is injective. Moreover, for each x\in \bigcup_{\sigma\in T}\mathrm{Im}(\sigma), there is some \sigma\in T with x\in \mathrm{Im}(\sigma), so \psi is surjective. It follows that \psi\in P is an upper bound of T.

It follows from Zorn’s Lemma that P has a maximal element \tau. We shall prove that \tau is an isomorphism between L_1 and L_2. Let M_1=\mathrm{D}(\tau) and M_2=\mathrm{Im}(\tau). Assume that M_1 \lneqq L_1 and \alpha is some element in L_1\backslash M_1. Clearly, \alpha is algebraic over K and thus over M_1. Let m_\alpha be the minimal polynomial of \alpha over M_1, let n_\alpha be the polynomial obtained from m_\alpha by applying \tau to the coefficients of m_\alpha and let S_1 and S_2 be splitting fields of m_\alpha and n_\alpha over M_1 and M_2 respectively. Then S_1\leq L_1 and S_2\leq L_2. But by the Lemma above, there exists an isomorphism from S_1 to S_2 extending \tau, which contradicts the maximality of \tau. Thus, M_1=L_1. Using the same argument for \tau^{-1}, we see that M_2=L_2. Hence, L_1\simeq L_2, as required.


Remark. It is easy to see that the algebraic closure of a field K can also be defined as the smallest algebraically closed extension field of K, i.e. the intersection of all algebraically closed field extensions of K (which itself is an algebraically closed field extension of K).
Indeed, let M be the algebraic closure of K and let L be the smallest algebraically closed field extension of K. Then clearly, L\leq M. Hence, every \alpha\in L is algebraic over K, so L:K is an algebraic field extension. But then, L is an algebraic closure and we have already seen that algebraic closures are unique up to isomorphism. It follows that L=M.

Yimin Ge’s Blog Goes Online

January 17, 2009

This is yet another attempt to prevent my mathematical activities from collapsing during the nine months of my tedious civilian service. I will occasionally post interesting problems or results that I discovered, or just random thoughts popping up in my mind which are not worth being mentioned in a paper.