Definitions and Theorems (Chapter 6)

Definition. Let V be a vector space over the field F and let T be a linear operator on V. A characteristic value of T is a scalar c in F such that there is a non-zero vector \alpha in V with T\alpha=c\alpha. If c is a characteristic value of T, then
( a ) any \alpha such that T\alpha=c\alpha is called a characteristic vector of T associated with the characteristic value c;
( b ) the collection of all \alpha such that T\alpha=c\alpha is called the characteristic space associated with c.

Theorem 1. Let T be a linear operator on a finite-dimensional space V and let c be a scalar. The following are equivalent.
(i) c is a characteristic value of T.
(ii) The operator (T-cI) is singular (not invertible).
(iii) \det (T-cI)=0.

Definition. If A is an n\times n matrix over the field F, a characteristic value of A in F is a scalar c in F such that the matrix (A-cI) is singular (not invertible).
The polynomial f=\det (xI-A) is called the characteristic polynomial of A.
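
As a quick illustration (the sample matrix and the use of sympy are my own choices, not from the text), the characteristic polynomial can be computed symbolically as \det(xI-A):

```python
from sympy import Matrix, symbols, eye, factor

x = symbols('x')
A = Matrix([[5, -6], [3, -4]])      # an arbitrary sample matrix

f = (x * eye(2) - A).det()          # f = det(xI - A)
print(factor(f))                    # (x - 2)*(x + 1): characteristic values 2 and -1
```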

Lemma. Similar matrices have the same characteristic polynomial.

Definition. Let T be a linear operator on the finite-dimensional space V. We say that T is diagonalizable if there is a basis for V each vector of which is a characteristic vector of T.

Lemma. Suppose that T\alpha=c\alpha. If f is any polynomial, then f(T)\alpha=f(c)\alpha.

Lemma. Let T be a linear operator on the finite-dimensional space V. Let c_1,\dots,c_k be the distinct characteristic values of T and let W_i be the space of characteristic vectors associated with the characteristic value c_i. If W=W_1+\cdots+W_k, then

\displaystyle{\dim W=\dim W_1+\cdots+\dim W_k}

In fact, if \mathfrak B_i is an ordered basis for W_i, then \mathfrak B=(\mathfrak B_1,\dots,\mathfrak B_k) is an ordered basis for W.

Theorem 2. Let T be a linear operator on a finite-dimensional space V. Let c_1,\dots,c_k be the distinct characteristic values of T and let W_i be the null space of (T-c_iI). The following are equivalent.
(i) T is diagonalizable.
(ii) The characteristic polynomial for T is

\displaystyle{f=(x-c_1)^{d_1}\cdots (x-c_k)^{d_k}}

and \dim W_i=d_i,i=1,\dots,k.
(iii) \dim W_1+\cdots+\dim W_k=\dim V.
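
Condition (iii) is easy to test in examples. A small sketch (the matrices and sympy usage are my own illustrative choices):

```python
from sympy import Matrix

# Diagonalizable: the eigenspace dimensions sum to dim V = 2.
A = Matrix([[3, 0], [0, 5]])
print(sum(len(v) for _, _, v in A.eigenvects()))   # 2

# Not diagonalizable: the single eigenspace of the Jordan block is 1-dimensional.
B = Matrix([[1, 1], [0, 1]])
print(sum(len(v) for _, _, v in B.eigenvects()))   # 1 < 2
```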

Definition. Let T be a linear operator on a finite-dimensional vector space V over the field F. The minimal polynomial for T is the (unique) monic generator of the ideal of polynomials over F which annihilate T.
If A is an n\times n matrix over F, we define the minimal polynomial for A as the unique monic generator of the ideal of all polynomials over F which annihilate A.

Theorem 3. Let T be a linear operator on an n-dimensional vector space V [or, let A be an n\times n matrix]. The characteristic and minimal polynomials for T [for A] have the same roots, except for multiplicities.

Theorem 4 (Cayley-Hamilton). Let T be a linear operator on a finite-dimensional vector space V. If f is the characteristic polynomial for T, then f(T)=0; in other words, the minimal polynomial divides the characteristic polynomial for T.

Definition. Let V be a vector space and T a linear operator on V. If W is a subspace of V, we say that W is **invariant under** T if for each vector \alpha in W the vector T\alpha is in W, i.e., if T(W) is contained in W.

When W is invariant under T, T induces a linear operator T_W on the space W by T_W(\alpha)=T(\alpha).

Lemma. Let W be an invariant subspace for T. The characteristic polynomial for the restriction operator T_W divides the characteristic polynomial for T. The minimal polynomial for T_W divides the minimal polynomial for T.

Definition. Let W be an invariant subspace for T and let \alpha be a vector in V. The T-conductor of \alpha into W is the set S_T(\alpha; W), which consists of all polynomials g (over the scalar field) such that g(T)\alpha is in W. In the special case W=\{0\} the conductor is called the T-annihilator of \alpha.

Lemma. If W is an invariant subspace for T, then W is invariant under every polynomial in T. Thus, for each \alpha in V, the conductor S(\alpha;W) is an ideal in the polynomial algebra F[x].

The unique monic generator of the ideal S(\alpha;W) is also called the T-conductor of \alpha into W (the T-annihilator in case W=\{0\}).
Every T-conductor divides the minimal polynomial of T.
The linear operator T is called triangulable if there is an ordered basis in which T is represented by a triangular matrix.

Lemma. Let V be a finite-dimensional vector space over the field F. Let T be a linear operator on V such that the minimal polynomial for T is a product of linear factors

\displaystyle{p=(x-c_1)^{r_1}\cdots(x-c_k)^{r_k}}

Let W be a proper (W\neq V) subspace of V which is invariant under T. There exists a vector \alpha in V such that
( a ) \alpha is not in W;
( b ) (T-cI)\alpha is in W, for some characteristic value c of the operator T.

Theorem 5. Let V be a finite-dimensional vector space over the field F, and let T be a linear operator on V. Then T is triangulable if and only if the minimal polynomial for T is a product of linear polynomials over F.
Corollary. Let F be an algebraically closed field, e.g., the complex number field. Every n\times n matrix over F is similar over F to a triangular matrix.

Theorem 6. Let V be a finite-dimensional vector space over the field F and let T be a linear operator on V. Then T is diagonalizable if and only if the minimal polynomial for T has the form

\displaystyle{p=(x-c_1)\cdots(x-c_k)}

where c_1,\cdots,c_k are distinct elements of F.

The subspace W is invariant under (the family of operators) \mathfrak F if W is invariant under each operator in \mathfrak F.

Lemma. Let \mathfrak F be a commuting family of triangulable linear operators on V. Let W be a proper subspace of V which is invariant under \mathfrak F. There exists a vector \alpha in V such that
( a ) \alpha is not in W;
( b ) for each T in \mathfrak F, the vector T\alpha is in the subspace spanned by \alpha and W.

Theorem 7. Let V be a finite-dimensional vector space over the field F. Let \mathfrak F be a commuting family of triangulable linear operators on V. There exists an ordered basis for V such that every operator in \mathfrak F is represented by a triangular matrix in that basis.
Corollary. Let \mathfrak F be a commuting family of n\times n matrices over an algebraically closed field F. There exists a non-singular n\times n matrix P with entries in F such that P^{-1}AP is upper-triangular, for every matrix A in \mathfrak F.

Theorem 8. Let \mathfrak F be a commuting family of diagonalizable linear operators on the finite-dimensional vector space V. There exists an ordered basis for V such that every operator in \mathfrak F is represented in that basis by a diagonal matrix.

Definition. Let W_1,\dots,W_k be subspaces of the vector space V. We say that W_1,\dots,W_k are independent if
\displaystyle{\alpha_1+\cdots+\alpha_k=0,\quad \alpha_i\in W_i}
implies that each \alpha_i=0.

Lemma. Let V be a finite-dimensional vector space. Let W_1,\dots,W_k be subspaces of V and let W=W_1+\cdots+W_k. The following are equivalent.
( a ) W_1,\dots,W_k are independent.
( b ) For each j, 2\leq j\leq k, we have W_j\cap(W_1+\cdots+W_{j-1})=\{0\}.
( c ) If \mathfrak B_i is an ordered basis for W_i,1\leq i\leq k, then the sequence \mathfrak B=(\mathfrak B_1,\dots,\mathfrak B_k) is an ordered basis for W.
If any of the conditions holds, we say the sum W=W_1+\cdots+W_k is direct or that W is the direct sum of W_1,\dots,W_k and we write

\displaystyle{W=W_1\oplus\cdots\oplus W_k }

Definition. If V is a vector space, a projection of V is a linear operator E on V such that E^2=E.
If R and N are subspaces of V such that V=R\oplus N, there is one and only one projection operator E which has range R and null space N. That operator is called the projection on R along N.

Theorem 9. If V=W_1\oplus\cdots\oplus W_k, then there exist k linear operators E_1,\dots,E_k on V such that
(i) each E_i is a projection (E_i^2=E_i);
(ii) E_iE_j=0, if i\neq j;
(iii) I=E_1+\cdots+E_k;
(iv) the range of E_i is W_i.
Conversely, if E_1,\dots,E_k are k linear operators on V which satisfy conditions (i), (ii) and (iii), and if we let W_i be the range of E_i, then V=W_1\oplus\cdots\oplus W_k.
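
To make Theorem 9 concrete, here is a sketch (the decomposition of R^3 is my own example) that builds the projections E_i from bases of the W_i: express a vector in coordinates relative to the combined basis, keep the W_i-coordinates, and map back.

```python
from sympy import Matrix, eye, zeros

# R^3 = W_1 (+) W_2 with W_1 = span{(1,0,0),(0,1,0)}, W_2 = span{(1,1,1)}
B = Matrix([[1, 0, 1],
            [0, 1, 1],
            [0, 0, 1]])                # columns: basis of W_1, then basis of W_2
Binv = B.inv()

E1 = B * Matrix.diag(1, 1, 0) * Binv   # keep the W_1-coordinates, kill the rest
E2 = B * Matrix.diag(0, 0, 1) * Binv   # keep the W_2-coordinates

assert E1 * E1 == E1 and E2 * E2 == E2   # (i)  projections
assert E1 * E2 == zeros(3, 3)            # (ii) E_i E_j = 0
assert E1 + E2 == eye(3)                 # (iii) I = E_1 + E_2
```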

Theorem 10. Let T be a linear operator on the space V, and let W_1,\dots,W_k and E_1,\dots,E_k be as in Theorem 9. Then a necessary and sufficient condition that each subspace W_i be invariant under T is that T commute with each of the projections E_i, i.e.,

\displaystyle{TE_i=E_iT,\qquad i=1,\dots,k.}

Theorem 11. Let T be a linear operator on a finite-dimensional space V.
If T is diagonalizable and if c_1,\dots,c_k are the distinct characteristic values of T, then there exist linear operators E_1,\dots,E_k on V such that
(i) T=c_1E_1+\cdots+c_kE_k;
(ii) I=E_1+\cdots+E_k;
(iii) E_iE_j=0,i\neq j;
(iv) E_i^2=E_i (E_i is a projection);
(v) the range of E_i is the characteristic space for T associated with c_i.
Conversely, if there exist k distinct scalars c_1,\dots,c_k and k non-zero linear operators E_1,\dots,E_k which satisfy conditions (i),(ii) and (iii), then T is diagonalizable, c_1,\dots,c_k are the distinct characteristic values of T, and conditions (iv) and (v) are satisfied also.

If T=c_1E_1+\cdots+c_kE_k, then g(T)=g(c_1)E_1+\cdots+g(c_k)E_k. In particular, if p_j=\prod\limits_{i\neq j}\dfrac{(x-c_i)}{(c_j-c_i)} are the Lagrange polynomials, then p_j(T)=E_j.
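
A small check of this identity (the matrix is my own choice; for two characteristic values the Lagrange polynomials are p_1=(x-c_2)/(c_1-c_2) and p_2=(x-c_1)/(c_2-c_1)):

```python
from sympy import Matrix, eye

A = Matrix([[3, 1], [0, 2]])     # diagonalizable, characteristic values 3 and 2

E1 = (A - 2 * eye(2)) / (3 - 2)  # p_1(A)
E2 = (A - 3 * eye(2)) / (2 - 3)  # p_2(A)

assert E1 + E2 == eye(2)                  # I = E_1 + E_2
assert E1 * E2 == Matrix.zeros(2, 2)      # E_1 E_2 = 0
assert E1 * E1 == E1 and E2 * E2 == E2    # projections
assert 3 * E1 + 2 * E2 == A               # T = c_1 E_1 + c_2 E_2
```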

Theorem 12 (Primary Decomposition Theorem). Let T be a linear operator on the finite-dimensional vector space V over the field F. Let p be the minimal polynomial for T,

\displaystyle{p=p_1^{r_1}{\cdots}p_k^{r_k}}

where the p_i are distinct irreducible monic polynomials over F and the r_i are positive integers. Let W_i be the null space of p_i(T)^{r_i},i=1,\dots,k. Then
(i) V=W_1\oplus\cdots\oplus W_k;
(ii) each W_i is invariant under T;
(iii) if T_i is the operator induced on W_i by T, then the minimal polynomial for T_i is p_i^{r_i}.
Corollary. If E_1,\dots,E_k are the projections associated with the primary decomposition of T, then each E_i is a polynomial in T, and accordingly if a linear operator U commutes with T then U commutes with each of the E_i, i.e., each subspace W_i is invariant under U.

Definition. Let N be a linear operator on the vector space V. We say that N is nilpotent if there is some positive integer r such that N^r=0.

Theorem 13. Let T be a linear operator on the finite-dimensional vector space V over the field F. Suppose that the minimal polynomial for T decomposes over F into a product of linear polynomials. Then there is a diagonalizable operator D on V and a nilpotent operator N on V such that
(i) T=D+N,
(ii) DN=ND.
The diagonalizable operator D and the nilpotent operator N are uniquely determined by (i) and (ii) and each of them is a polynomial in T.

Corollary. Let V be a finite-dimensional vector space over an algebraically closed field F, e.g., the field of complex numbers. Then every linear operator T on V can be written as the sum of a diagonalizable operator D and a nilpotent operator N which commute. These operators D and N are unique and each is a polynomial in T.
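
For a concrete matrix, D and N can be extracted from the Jordan form (this is my own sketch, not the book's method; sympy's jordan_form returns P and J with A = PJP^{-1}, and conjugating the diagonal of J back gives the diagonalizable part):

```python
from sympy import Matrix

A = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])   # the matrix of Exercise 2 below

P, J = A.jordan_form()                            # A = P J P^{-1}
D = P * Matrix.diag(*[J[i, i] for i in range(3)]) * P.inv()
N = A - D

assert D.is_diagonalizable()
assert N**3 == Matrix.zeros(3, 3)   # N is nilpotent (here already N^2 = 0)
assert D * N == N * D               # D and N commute
```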

Linear Algebra (2ed) Hoffman & Kunze 6.8

This theorem (Theorem 12) is perhaps one of the most important theorems in a non-computationally-oriented linear algebra course; about the only thing more important is the spectral theorem. Starting from the minimal polynomial of an operator T and its prime factors, it yields a decomposition of V into a direct sum of subspaces invariant under T, such that on each summand the operator induced by T has precisely the corresponding prime power as its minimal polynomial. In other words, according to how the minimal polynomial of T factors, V decomposes into subspaces each governed (via the minimal polynomial) by a single prime factor. This goes a step beyond the earlier discussion because it no longer requires the factors of the minimal polynomial to be linear; they may have higher degree. Previously, when the factors were all linear and occurred to the first power, we could diagonalize T, i.e., decompose V into subspaces consisting of characteristic vectors; when the factors were linear but with higher powers, T could be triangulated, but no direct-sum decomposition was obtained.
In the proof of this theorem, the projections E_i corresponding to the primary decomposition are obtained as polynomials in T; hence if U commutes with T, every subspace of the primary decomposition is also invariant under U.
Next comes a special case, in which the factors of the minimal polynomial of T are all linear (so T is triangulable but not necessarily diagonalizable). Theorem 13 shows that T can then be decomposed as the sum of a diagonalizable operator and a nilpotent operator; the two commute, are unique, and are both polynomials in T.
In fact, the further one reads, the more polynomials reveal themselves to be a remarkable tool.

Exercises

1. Let T be a linear operator on R^3 which is represented in the standard ordered basis by the matrix

\displaystyle{\begin{bmatrix}6&-3&-2\\4&-1&-2\\10&-5&-3\end{bmatrix}.}

Express the minimal polynomial p for T in the form p=p_1p_2, where p_1 and p_2 are monic and irreducible over the field of real numbers. Let W_i be the null space of p_i(T). Find bases \mathfrak B_i for the spaces W_1 and W_2. If T_i is the operator induced on W_i by T, find the matrix of T_i in the basis \mathfrak B_i.
Solution: We let A be the matrix above and then

\displaystyle{\begin{aligned}\det (xI-A)&=\begin{vmatrix}x-6&3&2\\-4&x+1&2\\-10&5&x+3\end{vmatrix}=\begin{vmatrix}x-2&2-x&0\\-4&x+1&2\\-10&5&x+3\end{vmatrix}\\&=\begin{vmatrix}x-2&0&0\\-4&x-3&2\\-10&-5&x+3\end{vmatrix}\\&=(x-2)(x^2+1)\end{aligned}}

Since A-2I\neq 0 and A^2+I\neq 0, the minimal polynomial of A is (x-2)(x^2+1), thus p_1=x-2 and p_2=x^2+1.
For W_1 we have

\displaystyle{2I-A=\begin{bmatrix}-4&3&2\\-4&3&2\\-10&5&5\end{bmatrix}\rightarrow \begin{bmatrix}-4&3&2\\-2&1&1\\0&0&0\end{bmatrix}\rightarrow \begin{bmatrix}-2&1&1\\0&1&0\\0&0&0\end{bmatrix}}

Thus we can let \mathfrak B_1=\{(1,0,2)\}.
For W_2 we have

\displaystyle{A^2+I=\begin{bmatrix}6&-3&-2\\4&-1&-2\\10&-5&-3\end{bmatrix}\begin{bmatrix}6&-3&-2\\4&-1&-2\\10&-5&-3\end{bmatrix}+I=\begin{bmatrix}5&-5&0\\0&0&0\\10&-10&0\end{bmatrix}}

Thus we can let \mathfrak B_2=\{(1,1,0),(1,1,1)\}.
The matrix of T_1 in the basis \mathfrak B_1 is [2], since W_1 is the space of characteristic vectors associated with the characteristic value 2; a direct computation confirms that

\displaystyle{A\begin{bmatrix}1\\0\\2\end{bmatrix}=\begin{bmatrix}6&-3&-2\\4&-1&-2\\10&-5&-3\end{bmatrix}\begin{bmatrix}1\\0\\2\end{bmatrix}=\begin{bmatrix}2\\0\\4\end{bmatrix}=2\begin{bmatrix}1\\0\\2\end{bmatrix}}

The matrix of T_2 in the basis \mathfrak B_2 can be computed from:

A\begin{bmatrix}1\\1\\0\end{bmatrix}=\begin{bmatrix}6&-3&-2\\4&-1&-2\\10&-5&-3\end{bmatrix}\begin{bmatrix}1\\1\\0\end{bmatrix}=\begin{bmatrix}3\\3\\5\end{bmatrix}=-2\begin{bmatrix}1\\1\\0\end{bmatrix}+5\begin{bmatrix}1\\1\\1\end{bmatrix} \\ A\begin{bmatrix}1\\1\\1\end{bmatrix}=\begin{bmatrix}6&-3&-2\\4&-1&-2\\10&-5&-3\end{bmatrix}\begin{bmatrix}1\\1\\1\end{bmatrix}=\begin{bmatrix}1\\1\\2\end{bmatrix}=-\begin{bmatrix}1\\1\\0\end{bmatrix}+2\begin{bmatrix}1\\1\\1\end{bmatrix}

Thus the matrix of T_2 in the basis \mathfrak B_2 is \begin{bmatrix}-2&-1\\5&2\end{bmatrix}.
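
The whole computation can be checked with sympy (the verification script is mine):

```python
from sympy import Matrix, symbols, eye, factor

x = symbols('x')
A = Matrix([[6, -3, -2], [4, -1, -2], [10, -5, -3]])

print(factor(A.charpoly(x).as_expr()))   # (x - 2)*(x**2 + 1)
print((A - 2 * eye(3)).nullspace())      # spans the line through (1, 0, 2)
print((A**2 + eye(3)).nullspace())       # two vectors spanning W_2

# matrix of T_2 in the basis {(1,1,0), (1,1,1)}
B2 = Matrix([[1, 1], [1, 1], [0, 1]])
coords = (B2.T * B2).inv() * B2.T        # a left inverse of B2, exact on W_2
print(coords * A * B2)                   # [[-2, -1], [5, 2]]
```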

2. Let T be the linear operator on R^3 which is represented by the matrix

\displaystyle{\begin{bmatrix}3&1&-1\\2&2&-1\\2&2&0\end{bmatrix}}

in the standard ordered basis. Show that there is a diagonalizable operator D on R^3 and a nilpotent operator N on R^3 such that T=D+N and DN=ND. Find the matrices of D and N in the standard basis.
Solution: We let A be the matrix above and then

\displaystyle{\begin{aligned}\det (xI-A)&=\begin{vmatrix}x-3&-1&1\\-2&x-2&1\\-2&-2&x\end{vmatrix}=\begin{vmatrix}x-1&1-x&0\\-2&x-2&1\\-2&-2&x\end{vmatrix}=\begin{vmatrix}x-1&0&0\\-2&x-4&1\\-2&-4&x\end{vmatrix}\\&=(x-1)(x-2)^2\end{aligned}}

Since (A-I)(A-2I)\neq 0, the minimal polynomial of A is p=(x-1)(x-2)^2, so we let

\displaystyle{f_1=(x-2)^2,\quad f_2=(x-1)}

Let g_1=1 and g_2=3-x; then f_1g_1+f_2g_2=1. Set E_1=(T-2I)^2 and E_2=(T-I)(3I-T); then D=E_1+2E_2 and N=T-D. The matrices of D and N in the standard basis are

\begin{aligned} {[D]}&=(A-2I)^2-2(A-I)(A-3I)\\&=\begin{bmatrix}1&1&-1\\2&0&-1\\2&2&-2\end{bmatrix}\begin{bmatrix}1&1&-1\\2&0&-1\\2&2&-2\end{bmatrix}-2\begin{bmatrix}2&1&-1\\2&1&-1\\2&2&-1\end{bmatrix}\begin{bmatrix}0&1&-1\\2&-1&-1\\2&2&-3\end{bmatrix}\\&=\begin{bmatrix}1&-1&0\\0&0&0\\2&-2&0\end{bmatrix}-2\begin{bmatrix}0&-1&0\\0&-1&0\\2&-2&-1\end{bmatrix}=\begin{bmatrix}1&1&0\\0&2&0\\-2&2&2\end{bmatrix}\end{aligned}

[N]=A-[D]=\begin{bmatrix}3&1&-1\\2&2&-1\\2&2&0\end{bmatrix}-\begin{bmatrix}1&1&0\\0&2&0\\-2&2&2\end{bmatrix}=\begin{bmatrix}2&0&-1\\2&0&-1\\4&0&-2\end{bmatrix}
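
A quick sympy verification of the three required properties (script mine):

```python
from sympy import Matrix, eye

A = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])
D = (A - 2 * eye(3))**2 - 2 * (A - eye(3)) * (A - 3 * eye(3))
N = A - D

assert D.is_diagonalizable()          # D is diagonalizable
assert N**2 == Matrix.zeros(3, 3)     # N is nilpotent
assert D * N == N * D                 # they commute (both are polynomials in A)
```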

3. If V is the space of all polynomials of degree less than or equal to n over a field F, prove that the differentiation operator on V is nilpotent.
Solution: The set of polynomials \{p_i=x^i, i=0,1,\dots,n\} is a basis for V, and D^{n+1}p_i=0 for every i; thus D^{n+1}=0 on V.
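
A one-line sympy illustration (the choice n = 3 and the sample polynomial are mine):

```python
from sympy import symbols, diff

x = symbols('x')
p = 5*x**3 - 2*x**2 + x - 7    # a polynomial of degree n = 3

print(diff(p, x, 4))           # 0: differentiating n + 1 times annihilates p
```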

4. Let T be a linear operator on the finite-dimensional space V with characteristic polynomial

\displaystyle{f=(x-c_1)^{d_1}\cdots(x-c_k)^{d_k}}

and minimal polynomial

\displaystyle{p=(x-c_1)^{r_1}\cdots(x-c_k)^{r_k}.}

Let W_i be the null space of (T-c_iI)^{r_i}.
( a ) Prove that W_i is the set of all vectors \alpha in V such that (T-c_iI)^m\alpha=0 for some positive integer m (which may depend upon \alpha).
( b ) Prove that the dimension of W_i is d_i.
Solution:
( a ) Let S=\{\alpha\in V:(T-c_iI)^m\alpha=0\text{ for some positive integer }m\}. Obviously W_i\subseteq S. Conversely, if (T-c_iI)^m\alpha=0 for some m\leq r_i, then (T-c_iI)^{r_i}\alpha=0 and \alpha\in W_i. If (T-c_iI)^m\alpha=0 for some m>r_i, then (T-c_iI)^{m-r_i}\alpha\in W_i; by the Primary Decomposition Theorem there is a projection E_i with \text{range }E_i=W_i, and by its corollary E_i is a polynomial in T and hence commutes with (T-c_iI)^{m-r_i}, so

\displaystyle{E_i(T-c_iI)^{m-r_i}\alpha=(T-c_iI)^{m-r_i}E_i\alpha=(T-c_iI)^{m-r_i}\alpha\implies(T-c_iI)^{m-r_i}(\alpha-E_i\alpha)=0}

Now if m-r_i\leq r_i, then \alpha-E_i\alpha\in W_i; if not, by the same method (T-c_iI)^{m-2r_i}(\alpha-E_i\alpha)\in W_i, so

\displaystyle{(T-c_iI)^{m-2r_i}[(\alpha-E_i\alpha)-E_i(\alpha-E_i\alpha)]=(T-c_iI)^{m-2r_i}(\alpha-E_i\alpha)=0}

Repeating this a finite number of times, we get \alpha-E_i\alpha\in W_i. But E_i(\alpha-E_i\alpha)=E_i\alpha-E_i^2\alpha=0, while E_i acts as the identity on W_i=\text{range }E_i, so \alpha-E_i\alpha=0, i.e., \alpha=E_i\alpha\in W_i. Thus S\subseteq W_i.
( b ) If T_i is the operator induced on W_i by T, then T_i-c_iI is nilpotent by the definition of W_i, thus the minimal polynomial for T_i-c_iI divides x^{r_i}, which by Theorem 3 means the characteristic polynomial of T_i-c_iI is x^{e_i}, where e_i=\dim W_i. Thus \det(xI-(T_i-c_iI))=\det((x+c_i)I-T_i)=x^{e_i}, and substituting x-c_i for x gives \det(xI-T_i)=(x-c_i)^{e_i}. As the characteristic polynomial for T is the product of the characteristic polynomials of all the T_i, i=1,\dots,k, we have

\displaystyle{(x-c_1)^{d_1}\cdots(x-c_k)^{d_k}=(x-c_1)^{e_1}\cdots(x-c_k)^{e_k}}

It follows that d_i=e_i.

5. Let V be a finite-dimensional vector space over the field of complex numbers. Let T be a linear operator on V and let D be the diagonalizable part of T. Prove that if g is any polynomial with complex coefficients, then the diagonalizable part of g(T) is g(D).
Solution: First, since D is diagonalizable, there is a basis \{\alpha_1,\dots,\alpha_n\} of V with D\alpha_i=c_i\alpha_i for i=1,\dots,n. As g(D)\alpha_i=g(c_i)\alpha_i, the operator g(D) is diagonalizable.
Notice that N=T-D is nilpotent, so there is an r with N^r=0. Expanding g(T)=g(D+N) and subtracting g(D), every term of N'=g(T)-g(D) contains a factor of N, so N'=GN where G is a polynomial in D and N; since D and N commute, so do G and N, and therefore N'^r=G^rN^r=0, i.e., N' is nilpotent. Now g(T)=g(D)+N', where g(D) is diagonalizable, N' is nilpotent, and the two commute (all of these operators are polynomials in T); the conclusion follows from the uniqueness part of Theorem 13.

6. Let V be a finite-dimensional vector space over the field F, and let T be a linear operator on V such that \text{rank }(T)=1. Prove that either T is diagonalizable or T is nilpotent, not both.
Solution: Let \{\alpha_1\} be a basis for \text{range }T and extend it to a basis \{\alpha_1,\dots,\alpha_n\} for V; then we have

\displaystyle{T\alpha_i=k_i\alpha_1,\quad i=1,\dots,n}

where at least one k_i is non-zero. If k_1=0, then T\alpha_1=0 and thus T^2\alpha_i=k_iT\alpha_1=0, which means T^2=0 and T is nilpotent; since T\neq 0, the minimal polynomial of T is x^2, so by Theorem 6, T is not diagonalizable.
If k_1\neq 0, then (T-k_1I)T\alpha_j=0 for all j, so the minimal polynomial of T divides x(x-k_1); being a product of distinct linear factors, it shows T is diagonalizable by Theorem 6. Also, since T^n\alpha_1=k_1^n\alpha_1\neq 0 for all positive integers n, T is not nilpotent.

7. Let V be a finite-dimensional vector space over F, and let T be a linear operator on V. Suppose that T commutes with every diagonalizable linear operator on V. Prove that T is a scalar multiple of the identity operator.
Solution: Let U be the operator defined by U\alpha_i=2^i\alpha_i for a basis \{\alpha_1,\dots,\alpha_n\} of V; then U is diagonalizable, and if W_i is the null space of (U-2^iI), then W_i is spanned by \alpha_i. Since T commutes with U, each W_i is invariant under T, so T\alpha_i=k_i\alpha_i for some scalar k_i.
Notice that the argument above only requires \{\alpha_1,\dots,\alpha_n\} to be a basis for V, so in fact every subspace spanned by a single non-zero vector of V is invariant under T. In particular, for any i\neq j there is a scalar k with T(\alpha_i+\alpha_j)=k(\alpha_i+\alpha_j)=k_i\alpha_i+k_j\alpha_j, which gives k_i=k=k_j; thus T is a scalar multiple of the identity operator.

8. Let V be the space of n\times n matrices over a field F, and let A be a fixed n\times n matrix over F. Define a linear operator T on V by T(B)=AB-BA. Prove that if A is a nilpotent matrix, then T is a nilpotent operator.
Solution: We shall prove by induction that T^k(B)=\sum_{i=0}^k(-1)^i\binom{k}{i}A^{k-i}BA^i.
When k=1 the statement is true. Suppose this holds for k, then

\displaystyle{\begin{aligned}T^{k+1}(B)&=T(T^k(B))=T\left(\sum_{i=0}^k(-1)^i\binom{k}{i}A^{k-i}BA^i\right)\\&=A\left(\sum_{i=0}^k(-1)^i\binom{k}{i}A^{k-i}BA^i\right)-\left(\sum_{i=0}^k(-1)^i\binom{k}{i}A^{k-i}BA^i\right)A\\&=\sum_{i=0}^k(-1)^i\binom{k}{i}A^{k+1-i}BA^i-\sum_{i=0}^k(-1)^i\binom{k}{i}A^{k-i}BA^{i+1}\\&=A^{k+1}B+\sum_{i=1}^k(-1)^i\binom{k}{i}A^{k+1-i}BA^i-\sum_{i=0}^{k-1}(-1)^i\binom{k}{i}A^{k-i}BA^{i+1}-(-1)^kBA^{k+1}\\&=A^{k+1}B+\sum_{i=1}^k\left[(-1)^i\binom{k}{i}A^{k+1-i}BA^i+(-1)^i\binom{k}{i-1}A^{k+1-i}BA^i\right]+(-1)^{k+1}BA^{k+1}\\&=A^{k+1}B+\sum_{i=1}^k\left[(-1)^i\binom{k+1}{i}A^{k+1-i}BA^i\right]+(-1)^{k+1}BA^{k+1}\\&=\sum_{i=0}^{k+1}(-1)^i\binom{k+1}{i}A^{k+1-i}BA^i\end{aligned}}

Thus if A is nilpotent, there is a positive integer r with A^r=0; then T^{2r+1}=0, since in every term of the expansion of T^{2r+1}(B) either i\geq r or (2r+1)-i\geq r, making A^{(2r+1)-i}BA^i=0. Hence T is nilpotent.
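
A numerical sanity check (the sample nilpotent A and the matrix B are my own choices):

```python
from sympy import Matrix, zeros

A = Matrix([[0, 1, 0], [0, 0, 1], [0, 0, 0]])   # A**3 = 0, so r = 3
B = Matrix(3, 3, lambda i, j: i + 2*j + 1)      # an arbitrary B

T = lambda M: A * M - M * A
C = B
for _ in range(2*3 + 1):                        # apply T exactly 2r + 1 times
    C = T(C)
print(C == zeros(3, 3))                         # True
```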

9. Give an example of two 4\times 4 nilpotent matrices which have the same minimal polynomial (they necessarily have the same characteristic polynomial) but which are not similar.
Solution: We let

\displaystyle{M=\begin{bmatrix}0&1&0&0\\0&0&0&0\\0&0&0&1\\0&0&0&0\end{bmatrix},N=\begin{bmatrix}0&0&0&0\\0&0&1&0\\0&0&0&0\\0&0&0&0\end{bmatrix}}

then M^2=N^2=0, so M and N are nilpotent, and since neither is the zero matrix, the minimal polynomial of each is x^2. Assume there is an invertible P such that P^{-1}MP=N; then MP=PN, which means

\displaystyle{\begin{bmatrix}0&1&0&0\\0&0&0&0\\0&0&0&1\\0&0&0&0\end{bmatrix}\begin{bmatrix}P_{11}&P_{12}&P_{13}&P_{14}\\P_{21}&P_{22}&P_{23}&P_{24}\\P_{31}&P_{32}&P_{33}&P_{34}\\P_{41}&P_{42}&P_{43}&P_{44}\end{bmatrix}=\begin{bmatrix}P_{11}&P_{12}&P_{13}&P_{14}\\P_{21}&P_{22}&P_{23}&P_{24}\\P_{31}&P_{32}&P_{33}&P_{34}\\P_{41}&P_{42}&P_{43}&P_{44}\end{bmatrix}\begin{bmatrix}0&0&0&0\\0&0&1&0\\0&0&0&0\\0&0&0&0\end{bmatrix}}
\displaystyle{\begin{bmatrix}P_{21}&P_{22}&P_{23}&P_{24}\\0&0&0&0\\P_{41}&P_{42}&P_{43}&P_{44}\\0&0&0&0\end{bmatrix}=\begin{bmatrix}0&0&P_{12}&0\\0&0&P_{22}&0\\0&0&P_{32}&0\\0&0&P_{42}&0\end{bmatrix}}

thus P is of the form

\displaystyle{P=\begin{bmatrix}P_{11}&P_{12}&P_{13}&P_{14}\\0&0&P_{23}&0\\P_{31}&P_{32}&P_{33}&P_{34}\\0&0&P_{43}&0\end{bmatrix}}

and since its second and fourth rows are linearly dependent, P is not invertible, a contradiction.
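
A sympy check (script mine); note also that rank is a similarity invariant, so rank M = 2 ≠ 1 = rank N gives an independent reason the two cannot be similar:

```python
from sympy import Matrix, symbols

x = symbols('x')
M = Matrix([[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
N = Matrix([[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]])

print(M**2 == Matrix.zeros(4, 4), N**2 == Matrix.zeros(4, 4))  # True True
print(M.charpoly(x).as_expr(), N.charpoly(x).as_expr())        # x**4, x**4
print(M.rank(), N.rank())                                      # 2, 1
```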

10. Let T be a linear operator on the finite-dimensional space V, let p=p_1^{r_1}{\cdots}p_k^{r_k} be the minimal polynomial for T, and let V=W_1\oplus\cdots\oplus W_k be the primary decomposition for T, i.e., W_j is the null space of p_j(T)^{r_j}. Let W be any subspace of V which is invariant under T. Prove that

\displaystyle{W=(W\cap W_1)\oplus(W\cap W_2)\oplus\cdots\oplus(W\cap W_k)}

Solution: Since W is invariant under T, we can define the restriction operator T_W on W, and the minimal polynomial for T_W divides p, since p(T)=0 implies p(T_W)=0 on W. Write the minimal polynomial for T_W as p'=p_1^{s_1}\cdots p_k^{s_k} with s_i\leq r_i for i=1,\dots,k; then by the primary decomposition theorem applied to T_W we have W=W_1'\oplus\cdots\oplus W_k', where W_j' is the null space of p_j(T_W)^{s_j}. Obviously W_j'\subseteq W\cap W_j. Conversely, if \alpha\in W\cap W_j, write \alpha=\alpha_1+\cdots+\alpha_k with \alpha_i\in W_i'\subseteq W_i; since V=W_1\oplus\cdots\oplus W_k and \alpha\in W_j, uniqueness of the decomposition forces \alpha_i=0 for i\neq j, so \alpha=\alpha_j\in W_j'. Thus W\cap W_j=W_j' and the conclusion follows.

11. What is wrong with the following proof of Theorem 13? Suppose that the minimal polynomial for T is a product of linear factors. Then, by Theorem 5, T is triangulable. Let \mathfrak B be an ordered basis such that A=[T]_{\mathfrak B} is upper-triangular. Let D be the diagonal matrix with diagonal entries a_{11},\dots,a_{nn}. Then A=D+N, where N is strictly upper-triangular. Evidently N is nilpotent.
Solution: The problem is that this construction does not guarantee DN=ND; in general the diagonal part and the strictly upper-triangular part of a triangular matrix do not commute.

12. If you thought about Exercise 11, think about it again, after you observe what Theorem 7 tells you about the diagonalizable and nilpotent parts of T.
Solution: It is not quite clear to me what Theorem 7 is meant to tell us here.

13. Let T be a linear operator on V with minimal polynomial of the form p^n, where p is irreducible over the scalar field. Show that there is a vector \alpha in V such that the T-annihilator of \alpha is p^n.
Solution: Since p^n is the minimal polynomial, p^{n-1}(T)\neq 0, so there is a vector \alpha\in V with p^{n-1}(T)\alpha\neq 0. Obviously p^n(T)\alpha=0, so the T-annihilator of \alpha divides p^n; since p is irreducible and p^{n-1}(T)\alpha\neq 0, it can only be p^n.

14. Use the primary decomposition theorem and the result of Exercise 13 to prove the following. If T is any linear operator on a finite-dimensional vector space V, then there is a vector \alpha\in V with T-annihilator equal to the minimal polynomial for T.
Solution: Let p=p_1^{r_1}\cdots p_k^{r_k} be the minimal polynomial for T. By the primary decomposition theorem, V=W_1\oplus\cdots\oplus W_k, where each W_i is invariant under T and the minimal polynomial for T_i is p_i^{r_i}. By Exercise 13, for each W_i there is a vector \alpha_i in W_i whose T_i-annihilator is p_i^{r_i}.
Let \alpha=\alpha_1+\cdots+\alpha_k; then p(T)\alpha=0, since each p_i^{r_i}(T) sends \alpha_i to zero. Suppose the T-annihilator q of \alpha had degree less than \deg p. Since q must divide the minimal polynomial p, we have q=p_1^{s_1}\cdots p_k^{s_k}, where s_i\leq r_i and at least one s_j<r_j. Each W_i is invariant under q(T), so q(T)\alpha_i\in W_i, and q(T)\alpha=0 together with the directness of the sum forces q(T)\alpha_j=0. Let \beta=p_j^{s_j}(T)\alpha_j\in W_j; then \prod_{i\neq j}p_i^{s_i}(T)\beta=q(T)\alpha_j=0, so \beta lies in the null space of \prod_{i\neq j}p_i^{r_i}(T), which is W_1\oplus\cdots\oplus W_{j-1}\oplus W_{j+1}\oplus\cdots\oplus W_k. But \beta\in W_j, and W_j intersects that sum in \{0\}; hence p_j^{s_j}(T)\alpha_j=\beta=0 with s_j<r_j, contradicting the fact that the T_j-annihilator of \alpha_j is p_j^{r_j}.

15. If N is a nilpotent linear operator on an n-dimensional vector space V, then the characteristic polynomial for N is x^n.
Solution: This is in fact a corollary of the generalized Cayley-Hamilton theorem; see Theorem 4 of Chapter 7 for a proof. We can prove it now using Theorem 5. Since N is nilpotent, its minimal polynomial is x^r for some r\geq 1, a product of linear factors, so by Theorem 5 there is an ordered basis \mathfrak B in which the matrix A of N is upper-triangular. Then A is nilpotent, and since the diagonal entries of A^k are the k-th powers of the diagonal entries of A, the diagonal entries of A are all zero. Thus \det(xI-A)=x^n.
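
A quick check on a sample nilpotent matrix (example mine):

```python
from sympy import Matrix, symbols

x = symbols('x')
N = Matrix([[0, 2, 5], [0, 0, 3], [0, 0, 0]])   # strictly upper-triangular, so nilpotent
print(N.charpoly(x).as_expr())                  # x**3
```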

Linear Algebra (2ed) Hoffman & Kunze 6.7

This section considers direct sums in which every subspace is invariant under T. The benefit of this situation is that T restricts to a linear operator T_i on each W_i, so T can be viewed as the direct sum of T_1,\dots,T_k: the action of T decomposes into its actions on the independent pieces W_i. How the W_i are obtained was in fact already answered by Theorem 9, namely through the projections E_i; to guarantee the additional invariance condition it is necessary (and sufficient) that T commute with each E_i (Theorem 10).
For diagonalizable operators, Theorem 11 says that if the distinct characteristic values are c_1,\dots,c_k, then there is an elaborate system of conditions on projections E_1,\dots,E_k which is equivalent to diagonalizability, the range of each E_i being exactly the characteristic space of c_i. The theorem is intricate and hard to describe in words, but one of its aims is to obtain results through abstract, purely algebraic computations with operators.
Finally, the decomposition T=c_1E_1+\cdots+c_kE_k is shown to have a very pleasant property: polynomials can be applied directly, g(T)=g(c_1)E_1+\cdots+g(c_k)E_k, which also reveals the relation between the Lagrange interpolation polynomials and the E_i. Using only properties of polynomials and operators, this gives a new proof of Theorem 6 (the relation between diagonalizability and the minimal polynomial).

Exercises

1. Let E be a projection of V and let T be a linear operator on V. Prove that the range of E is invariant under T if and only if ETE=TE. Prove that both the range and null space of E are invariant under T if and only if ET=TE.
Solution: If \text{range }E is invariant under T, then for any \alpha\in V,E\alpha\in \text{range }E, thus TE\alpha\in \text{range }E, then E(TE\alpha)=TE\alpha since E is a projection. Conversely, if ETE=TE, then for \beta\in \text{range }E, there is \alpha\in V such that \beta =E\alpha, which means T\beta=TE\alpha=ETE\alpha, thus T\beta is in \text{range }E.
Now if ET=TE, then ETE=EET=ET=TE, thus the range of E is invariant under T by what was just proved. Consider the null space of E: if \alpha\in \text{null }E, then E\alpha=0, thus 0=T(E\alpha)=ET\alpha, so T\alpha\in \text{null }E and \text{null }E is invariant under T.
Conversely, suppose \text{null }E and \text{range }E are invariant under T. Since V=\text{null }E\oplus \text{range }E, any \alpha\in V can be written \alpha=\beta+\gamma with \beta\in \text{null }E and \gamma\in \text{range }E. Then ET\alpha=ET\beta+ET\gamma; since \beta\in \text{null }E implies T\beta\in \text{null }E, we have ET\beta=0, and since \text{range }E is invariant and E\gamma=\gamma, we have ET\gamma=T\gamma=TE\gamma. Noting that TE\beta=0, we get
\displaystyle{ET\alpha=TE\gamma=TE\gamma+TE\beta=TE\alpha}

2. Let T be the linear operator on R^2, the matrix of which in the standard ordered basis is \begin{bmatrix}2&1\\0&2\end{bmatrix}. Let W_1 be the subspace of R^2 spanned by the vector \epsilon_1=(1,0).
( a ) Prove that W_1 is invariant under T.
( b ) Prove that there is no subspace W_2 which is invariant under T and which is complementary to W_1: R^2=W_1\oplus W_2.

Solution:
( a ) From the matrix we know that T\epsilon_1=2\epsilon_1\in W_1.
( b ) Suppose there were a subspace W_2 invariant under T with R^2=W_1\oplus W_2. Then a basis of W_2 together with \epsilon_1 would form a basis for R^2, so W_2 is spanned by a single vector a\epsilon_1+b\epsilon_2 with b\neq 0 (otherwise W_1\cap W_2\neq \{0\}). Since W_2 is invariant under T, we have

\displaystyle{T(a\epsilon_1+b\epsilon_2)=2a\epsilon_1+b(\epsilon_1+2\epsilon_2)=k(a\epsilon_1+b\epsilon_2)}

thus 2a+b=ka and 2b=kb; since b\neq 0, the second equation gives k=2, and then the first gives b=0, a contradiction.

3. Let T be a linear operator on a finite-dimensional vector space V. Let R be the range of T and let N be the null space of T. Prove that R and N are independent if and only if V=R\oplus N.
Solution: If V=R\oplus N, then R and N are independent by definition. Conversely, suppose R and N are independent and let W=R\oplus N; then a basis for R combined with a basis for N is a basis of W, so \dim W=\dim R+\dim N=\dim V by the rank-nullity theorem. Since W\subseteq V, we have W=V.

4. Let T be a linear operator on V. Suppose V=W_1\oplus\cdots\oplus W_k, where each W_i is invariant under T. Let T_i be the induced (restriction) operator on W_i.
( a ) Prove that \det(T)=\det(T_1)\cdots\det(T_k).
( b ) Prove that the characteristic polynomial f for T is the product of the characteristic polynomials f_1,\dots,f_k for T_1,\dots,T_k.
( c ) Prove that the minimal polynomial for T is the least common multiple of the minimal polynomials for T_1,\dots,T_k.
Solution: We select an ordered basis \mathscr B_i for each W_i; then \mathscr B=(\mathscr B_1,\dots,\mathscr B_k) is an ordered basis for V. We have

\displaystyle{A=[T]_{\mathscr B}=\begin{bmatrix}[T_1]_{\mathscr B_1}&0&\cdots&0\\0&[T_2]_{\mathscr B_2}&\cdots&0\\ \vdots&\vdots&&\vdots\\0&0&\cdots&[T_k]_{\mathscr B_k}\end{bmatrix}}

( a ) Using the formula for the determinant of a block diagonal matrix, we have

\displaystyle{\det(T)=\det A=\prod_{i=1}^k\det([T_i]_{\mathscr B_i})=\prod_{i=1}^k\det(T_i)}

( b ) If we let d_i=\dim W_i then

\displaystyle{xI-A=xI-[T]_{\mathscr B}=\begin{bmatrix}xI_{d_1}-[T_1]_{\mathscr B_1}&0&\cdots&0\\0&xI_{d_2}-[T_2]_{\mathscr B_2}&\cdots&0\\ \vdots&\vdots&&\vdots\\0&0&\cdots&xI_{d_k}-[T_k]_{\mathscr B_k}\end{bmatrix}}

thus the conclusion follows.
( c ) It is easy to prove that

\displaystyle{A^n=\begin{bmatrix}[T_1]^n_{\mathscr B_1}&0&\cdots&0\\0&[T_2]^n_{\mathscr B_2}&\cdots&0\\ \vdots&\vdots&&\vdots\\0&0&\cdots&[T_k]^n_{\mathscr B_k}\end{bmatrix},\quad n=1,2,\dots}

thus for any g\in F[x],

\displaystyle{g(A)=\begin{bmatrix}g([T_1]_{\mathscr B_1})&0&\cdots&0\\0&g([T_2]_{\mathscr B_2})&\cdots&0\\ \vdots&\vdots&&\vdots\\0&0&\cdots&g([T_k]_{\mathscr B_k})\end{bmatrix}}

so g(T)=0 if and only if g(T_i)=0 for i=1,\dots,k. Taking g to be the minimal polynomial for T shows that each minimal polynomial for T_i divides it, hence so does their least common multiple; taking g to be the least common multiple shows g(T)=0, so the minimal polynomial for T divides g. Since both are monic, the minimal polynomial for T is the least common multiple of the minimal polynomials for T_1,\dots,T_k.

5. Let T be the diagonalizable linear operator on R^3 which we discussed in Example 3 of Section 6.2. Use the Lagrange polynomials to write the representing matrix A in the form A=E_1+2E_2, E_1+E_2=I, E_1E_2=0.
Solution: As c_1=1 and c_2=2, we have p_1=-(x-2) and p_2=x-1, which means E_1=p_1(A)=2I-A and E_2=p_2(A)=A-I. Since A is diagonalizable with characteristic values 1 and 2, we also have (A-I)(A-2I)=0. Now

E_1+2E_2=2I-A+2A-2I=A\\E_1+E_2=2I-A+A-I=I\\E_1E_2=(2I-A)(A-I)=2A-2I-A^2+A=-(A-I)(A-2I)=0

6. Let A be the 4\times 4 matrix in Example 5 of Section 6.3. Find matrices E_1,E_2,E_3 such that A=c_1E_1+c_2E_2+c_3E_3, E_1+E_2+E_3=I and E_iE_j=0, i\neq j.
Solution: The characteristic values of A are 0, 2, and -2; we let c_1=0, c_2=2, c_3=-2, and then

E_1=p_1(A)=\dfrac{(A-2I)(A+2I)}{(0-2)(0+2)}=-\dfrac{1}{4}(A^2-4I)=-\dfrac{1}{4}\begin{bmatrix}-2&0&2&0\\0&-2&0&2\\2&0&-2&0\\0&2&0&-2\end{bmatrix} \\ E_2=p_2(A)=\dfrac{A(A+2I)}{(2-0)(2+2)}=\dfrac{1}{8}(A^2+2A)=\dfrac{1}{8}\begin{bmatrix}2&2&2&2\\2&2&2&2\\2&2&2&2\\2&2&2&2\end{bmatrix}\\E_3=p_3(A)=\dfrac{A(A-2I)}{(-2-0)(-2-2)}=\dfrac{1}{8}(A^2-2A)=\dfrac{1}{8}\begin{bmatrix}2&-2&2&-2\\-2&2&-2&2\\2&-2&2&-2\\-2&2&-2&2\end{bmatrix}
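
A sympy verification (the matrix A below is reconstructed from the computations above and assumed to be the one of Example 5 of Section 6.3):

```python
from sympy import Matrix, eye

A = Matrix([[0, 1, 0, 1],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [1, 0, 1, 0]])

E1 = -(A**2 - 4 * eye(4)) / 4
E2 = (A**2 + 2 * A) / 8
E3 = (A**2 - 2 * A) / 8

Z = Matrix.zeros(4, 4)
assert E1 + E2 + E3 == eye(4)
assert E1 * E2 == Z and E1 * E3 == Z and E2 * E3 == Z
assert 0 * E1 + 2 * E2 + (-2) * E3 == A
```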

7. In Exercises 5 and 6, notice that (for each i) the space of characteristic vectors associated with the characteristic value c_i is spanned by the column vectors of the various matrices E_j with j\neq i. Is that a coincidence?
Solution: This need not be the case in general: the space of characteristic vectors associated with the characteristic value c_i is the range of E_i, and hence is spanned by the column vectors of E_i itself.

8. Let T be a linear operator on V which commutes with every projection operator on V. What can you say about T?
Solution: For any subspace W of V we can find W' such that V=W\oplus W', and thus a projection E with \text{range }E=W. Since T commutes with E, by Exercise 1 the subspace W is invariant under T; hence every subspace of V is invariant under T. In particular every subspace spanned by a single non-zero vector is invariant, so every non-zero vector is a characteristic vector of T, and as in Exercise 7 of Section 6.8 above, T must be a scalar multiple of the identity operator.

9. Let V be the vector space of continuous real-valued functions on the interval [-1,1] of the real line. Let W_e be the subspace of even functions, f(-x)=f(x), and let W_o be the subspace of odd functions, f(-x)=-f(x).
( a ) Show that V=W_e\oplus W_o.
( b ) If T is the indefinite integral operator (Tf)(x)=\int_0^xf(t)dt, are W_e and W_o invariant under T?
Solution:
( a ) For any f\in V, we define g_f(x)=f(-x), then f=\dfrac{f+g_f}{2}+\dfrac{f-g_f}{2}, and we have

\dfrac{f+g_f}{2}(-x)=\dfrac{f(-x)+g_f(-x)}{2}=\dfrac{g_f(x)+f(x)}{2}=\dfrac{f+g_f}{2}(x) \\ \dfrac{f-g_f}{2}(-x)=\dfrac{f(-x)-g_f(-x)}{2}=\dfrac{g_f(x)-f(x)}{2}=-\dfrac{f-g_f}{2}(x)

thus \dfrac{f+g_f}{2}\in W_e and \dfrac{f-g_f}{2}\in W_o, so V=W_e+W_o. If h\in W_e\cap W_o, then h(-x)=-h(x)=h(x), which means h=0; hence V=W_e\oplus W_o.
( b ) If f\in W_e, then

\displaystyle{(Tf)(-x)=\int_0^{-x}f(t)dt=\int_0^{x}f(-t)d(-t)=-\int_0^{x}f(t)d(t)=-(Tf)(x)}

which means Tf is an odd function. Also if f\in W_o, then

\displaystyle{(Tf)(-x)=\int_0^{-x}f(t)dt=\int_0^{x}f(-t)d(-t)=-\int_0^{x}-f(t)d(t)=(Tf)(x)}

which means Tf is an even function. Thus T maps W_e into W_o and W_o into W_e, so neither W_e nor W_o is invariant under T.
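
A symbolic illustration of part (b) (the sample even function is mine):

```python
from sympy import symbols, integrate, cos, simplify

x, t = symbols('x t')

f = cos(t)                             # an even function
Tf = integrate(f, (t, 0, x))           # sin(x)
print(simplify(Tf.subs(x, -x) + Tf))   # 0, so Tf is odd: T maps W_e into W_o
```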