Analysis on Manifolds 2. Matrix Inversion and Determinants

This section continues the review of linear algebra, covering elementary matrices, matrix inversion, determinants, cofactors, Cramer's rule, and related topics; overall it is also straightforward.

Exercises

Exercise 1. Consider the matrix

\displaystyle{A=\begin{bmatrix}1&2\\1&-1\\0&1\end{bmatrix}.}

( a ) Find two different left inverses for A.
( b ) Show that A has no right inverse.
Solution:
(a) The two left inverses for A may be

\displaystyle{B_1=\begin{bmatrix}1&0&-2\\0&0&1\end{bmatrix},\quad B_2=\begin{bmatrix}0&1&1\\0&0&1\end{bmatrix}}

(b) Assume A has a right inverse

\displaystyle{B=\begin{bmatrix}a&b&c\\d&e&f\end{bmatrix},\quad AB=I_3}

then

\displaystyle{AB=\begin{bmatrix}1&2\\1&-1\\0&1\end{bmatrix}\begin{bmatrix}a&b&c\\d&e&f\end{bmatrix}=\begin{bmatrix}a+2d&b+2e&c+2f\\a-d&b-e&c-f\\d&e&f\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}}

From the bottom row we have d=e=0 and f=1. The second row then gives c-f=0, so c=1, while the first row gives c+2f=0, so c=-2, a contradiction.
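As a numeric sanity check (a sketch in plain Python; the helper name matmul is my own), one can verify that both candidates are left inverses, while the right-inverse system is inconsistent:

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2], [1, -1], [0, 1]]
B1 = [[1, 0, -2], [0, 0, 1]]
B2 = [[0, 1, 1], [0, 0, 1]]

# Both candidates are left inverses: B1*A = B2*A = I_2.
assert matmul(B1, A) == [[1, 0], [0, 1]]
assert matmul(B2, A) == [[1, 0], [0, 1]]

# For a right inverse, the bottom row of A*B forces d = 0, e = 0, f = 1;
# the second row then forces c = f = 1 while the first row forces
# c = -2f = -2, so the system A*B = I_3 has no solution.
```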

Exercise 2. Let A be an n by m matrix with n\neq m.
( a ) If \text{rank }A=m, show there exists a matrix D that is a product of elementary matrices such that

\displaystyle{D\cdot A=\begin{bmatrix}I_m\\0\end{bmatrix}.}

( b ) Show that A has a left inverse if and only if \text{rank }A=m.
( c ) Show that A has a right inverse if and only if \text{rank }A=n.
Solution:
( a ) If \text{rank }A=m, then since the rank is at most the number of rows, n\geq m; as n\neq m, we have n>m. Elementary row operations reduce A to \begin{bmatrix}I_m\\0\end{bmatrix}, and D is the product of the corresponding elementary matrices.
( b ) If \text{rank }A=m, then by (a) we can find D such that D\cdot A= \begin{bmatrix}I_m\\0\end{bmatrix}, let the m\times n matrix E be E=\begin{bmatrix}I_m&0\end{bmatrix}, then

\displaystyle{EDA=\begin{bmatrix}I_m&0\end{bmatrix}\begin{bmatrix}I_m\\0\end{bmatrix}=I_m}

thus ED is a left inverse of A. Conversely, if A has a left inverse, then there is B s.t. BA=I_m, from the proof of Theorem 2.3 we know \text{rank }A=m.
( c ) From step 3 of the proof of Theorem 2.3 we know that if A has a right inverse, then \text{rank }A=n. Conversely, if \text{rank }A=n, then \text{rank }A^{tr}=n, so by (b), A^{tr} has a left inverse B with BA^{tr}=I_n; transposing gives (BA^{tr})^{tr}=AB^{tr}=I_n, so B^{tr} is a right inverse for A.

Exercise 3. Verify that the functions defined in Example 1 satisfy the axioms for the determinant function.
Solution: For 1 by 1 matrices, we have \det [a]=a, the verification is
(1) there is no way to exchange rows.
(2) we have \det[a]=a=a\cdot 1=a \det[1].
(3) \det[1]=1.
For 2 by 2 matrices, we have \det \begin{bmatrix}a&b\\c&d\end{bmatrix}=ad-bc, the verification is
(1) \det \begin{bmatrix}c&d\\a&b\end{bmatrix}=cb-ad=-(ad-bc)=-\det \begin{bmatrix}a&b\\c&d\end{bmatrix}
(2) Let \mathrm x=(x_1,x_2 ),\mathrm{y}=(y_1,y_2 ), then

\displaystyle{A_1 (x)=\begin{bmatrix}x_1&x_2\\c&d\end{bmatrix},A_2 (x)=\begin{bmatrix}a&b\\x_1&x_2 \end{bmatrix},A_1 (y)=\begin{bmatrix}y_1&y_2\\c&d\end{bmatrix},A_2 (y)=\begin{bmatrix}a&b\\y_1&y_2 \end{bmatrix}}

so

\displaystyle{\begin{aligned}\det A_1 (m\mathrm{x}+n\mathrm{y})&=\det \begin{bmatrix}mx_1+ny_1&mx_2+ny_2\\c&d\end{bmatrix}\\&=(mx_1+ny_1 )d-(mx_2+ny_2 )c\\&=m(x_1 d-x_2 c)+n(y_1 d-y_2 c)\\&=m \det A_1 (x)+n \det A_1 (y)\end{aligned}}

Similarly \det A_2 (mx+ny)=m \det A_2 (x)+n \det A_2 (y).
(3) \det \begin{bmatrix}1&0\\0&1\end{bmatrix}=1\cdot 1-0\cdot 0=1
The verification for 3 by 3 matrices is of the same logic and omitted.

Exercise 4. ( a ) Let A be an n by n matrix of rank n. By applying elementary row operations to A, one can reduce A to the identity matrix. Show that by applying the same operations, in the same order, to I_n, one obtains the matrix A^{-1}.
( b ) Let

\displaystyle{A=\begin{bmatrix}1&2&3\\0&1&2\\1&2&1\end{bmatrix}.}

Calculate A^{-1} by using the algorithm suggested in (a).
( c ) Calculate A^{-1} using the formula involving determinants.
Solution:
( a ) By Theorem 2.1 there are elementary matrices E_1,\dots,E_k such that E_k\cdots E_1 A=I_n. Let B=E_k\cdots E_1; then B is a left inverse for A, and by Theorem 2.5, B is also a right inverse, so B=A^{-1}. Applying the same operations, in the same order, to I_n yields E_k\cdots E_1 I_n=BI_n=B=A^{-1}.
( b ) We have

\displaystyle{\begin{aligned}\begin{bmatrix}A&I_3 \end{bmatrix}&=\begin{bmatrix}1&2&3&1&0&0\\0&1&2&0&1&0\\1&2&1&0&0&1\end{bmatrix}{\rightarrow}\begin{bmatrix}1&2&3&1&0&0\\0&1&2&0&1&0\\0&0&-2&-1&0&1\end{bmatrix}\\&{\rightarrow}\begin{bmatrix}1&2&0&-1/2&0&3/2\\0&1&0&-1&1&1\\0&0&1&1/2&0&-1/2\end{bmatrix}{\rightarrow}\begin{bmatrix}1&0&0&3/2&-2&-1/2\\0&1&0&-1&1&1\\0&0&1&1/2&0&-1/2\end{bmatrix}\end{aligned}}

thus

A^{-1}=\begin{bmatrix}3/2&-2&-1/2\\-1&1&1\\1/2&0&-1/2\end{bmatrix}.

( c ) The calculation is based on Theorem 2.14, and is omitted.
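The [A\mid I] reduction in (b) can be sketched in exact rational arithmetic; this is my own minimal Gauss-Jordan sketch (the helper name invert is not from the text):

```python
from fractions import Fraction

def invert(A):
    """Invert a square matrix via Gauss-Jordan reduction of [A | I_n]."""
    n = len(A)
    # Build the augmented matrix over exact fractions.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Swap a row with a nonzero pivot into position.
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        # Scale the pivot row so the pivot becomes 1.
        M[col] = [x / M[col][col] for x in M[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            factor = M[r][col]
            if r != col and factor != 0:
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

A = [[1, 2, 3], [0, 1, 2], [1, 2, 1]]
Ainv = invert(A)
# Matches the hand computation above.
assert Ainv == [[Fraction(3, 2), Fraction(-2), Fraction(-1, 2)],
                [Fraction(-1), Fraction(1), Fraction(1)],
                [Fraction(1, 2), Fraction(0), Fraction(-1, 2)]]
```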

Exercise 5. Let A=\begin{bmatrix}a&b\\c&d\end{bmatrix}, where ad-bc\neq 0. Find A^{-1}.
Solution: By Theorem 2.14,

\displaystyle{b_{11}=\frac{(-1)^{1+1} d}{ad-bc},b_{12}=\frac{(-1)^{1+2}b}{ad-bc},b_{21}=\frac{(-1)^{2+1}c}{ad-bc},b_{22}=\frac{(-1)^{2+2}a}{ad-bc}}

thus

\displaystyle{A^{-1}=\frac{1}{ad-bc}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}}

Exercise 6. Prove the following theorem: Let A be a k by k matrix, let D have size n by n and let C have size n by k. Then

\displaystyle{\det\begin{bmatrix}A&0\\C&D\end{bmatrix}=(\det A)\cdot(\det D).}

Solution: We have

\displaystyle{\begin{bmatrix}A&0\\0&I_n \end{bmatrix} \begin{bmatrix}I_k&0\\C&D\end{bmatrix}=\begin{bmatrix}A&0\\C&D\end{bmatrix}}

and by Lemma 2.12 we have

\displaystyle{\det \begin{bmatrix}A&0\\0&I_n \end{bmatrix}=\det A,\quad\det \begin{bmatrix}I_k&0\\C&D\end{bmatrix}=\det D}

so by Theorem 2.10 we have

\displaystyle{\det \begin{bmatrix}A&0\\C&D\end{bmatrix}=(\det A)\cdot(\det D)}
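The identity can be spot-checked numerically with a small cofactor-expansion determinant; the helper det and the sample blocks below are my own sketch, not from the text:

```python
def det(M):
    """Determinant by first-row cofactor expansion (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

A = [[1, 2], [3, 4]]                   # k = 2, det A = -2
D = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]  # n = 3
C = [[5, 7], [0, 2], [1, 1]]           # an arbitrary n-by-k block
# Assemble the block matrix [[A, 0], [C, D]] row by row.
block = [row + [0, 0, 0] for row in A] + \
        [C[i] + D[i] for i in range(3)]
assert det(block) == det(A) * det(D)
```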

Analysis on Manifolds 1. Review of Linear Algebra

This is the first section of the first chapter. It reviews basic linear algebra; the linear algebra used in analysis on manifolds is very elementary and classical, such as vector spaces, inner products, matrices, linear transformations, rank, and transpose.
Of the six theorems in this section: Theorem 1.1 relates the dimension of a space to the number of basis vectors. Theorem 1.2 shows that any linearly independent set with fewer vectors than the dimension of the space can be extended to a basis. Theorem 1.3 bounds the sup norm of a matrix product by the product of the two sup norms times their common dimension; this upper-bound estimate is quite useful in the later discussion of calculus on \mathbf R^n. Theorem 1.4 expresses a kind of "reachability" of linear transformations into the target space: a transformation can be chosen to carry a basis to any given set of vectors. Theorem 1.5 states that the row rank and column rank of a matrix are equal. Theorem 1.6 shows that elementary row operations do not change the rank of a matrix.

Exercises

Exercise 1. Let V be a vector space with inner product \langle x,y\rangle and norm |x|=\langle x,x\rangle^{1/2}.
( a ) Prove the Cauchy-Schwarz inequality \langle x,y\rangle\leq| x | | y |.
( b ) Prove that |x+y|\leq|x|+|y|.
( c ) Prove that |x-y|\geq|x|-|y|.

Solution:
( a ) If x=0 or y=0 the inequality holds trivially. Suppose x,y\neq 0 and let c=1/|x| and d=1/|y|; since |cx-dy|^2\geq 0, expanding gives

\displaystyle{c^2|x|^2-2cd\langle x,y\rangle+d^2|y|^2 \geq 0 \implies cd\langle x,y\rangle\leq 1\implies \langle x,y\rangle\leq |x||y|}

( b ) We have

\displaystyle{ \begin{aligned}|x+y|^2&=\langle x+y,x+y\rangle =|x|^2+2\langle x,y\rangle +|y|^2 \\&\leq |x|^2+2|x||y| +|y|^2=(|x|+|y|)^2\end{aligned}}

( c ) We have

\displaystyle{ \begin{aligned}|x-y|^2&=\langle x-y,x-y\rangle =|x|^2-2\langle x,y\rangle +|y|^2 \\&\geq |x|^2-2|x||y| +|y|^2=(|x|-|y|)^2\end{aligned}}
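A quick numeric check of all three inequalities for the standard dot product on \mathbf R^3 (a sketch; the helper names are my own):

```python
import math

def dot(u, v):
    """Standard dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Norm derived from the dot product."""
    return math.sqrt(dot(u, u))

x = (1.0, -2.0, 3.0)
y = (4.0, 0.5, -1.0)
assert dot(x, y) <= norm(x) * norm(y)                              # Cauchy-Schwarz
assert norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y)    # triangle inequality
assert norm([a - b for a, b in zip(x, y)]) >= norm(x) - norm(y)    # reverse form
```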

Exercise 2. If A is an n by m matrix and B is an m by p matrix, show that

\displaystyle{|A\cdot B|\leq m|A| |B|}

Solution: We denote A=(a_{ij}) and B=(b_{jk}), where 1\leq i\leq n,1\leq j\leq m and 1\leq k\leq p. Obviously |a_{ij}|\leq |A| and |b_{jk}|\leq |B|. Now let c_{ik} be the ik-th element of the product A\cdot B; we have

\displaystyle{|c_{ik}|=\left|\sum_{j=1}^ma_{ij}b_{jk}\right|\leq\sum_{j=1}^m|a_{ij}||b_{jk}|\leq m|A||B|\implies |A\cdot B|\leq m|A| |B|}
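A small numeric illustration of the bound (the helper names sup_norm and matmul are my own):

```python
def sup_norm(M):
    """Sup norm of a matrix: the largest absolute value of any entry."""
    return max(abs(x) for row in M for x in row)

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, -3, 2], [0, 4, -1]]    # n = 2, m = 3
B = [[2, 1], [-1, 3], [0, -2]]  # m = 3, p = 2
m = 3
# |A| = 4, |B| = 3, so the bound is 36; the product's sup norm is well below it.
assert sup_norm(matmul(A, B)) <= m * sup_norm(A) * sup_norm(B)
```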

Exercise 3. Show that the sup norm on \mathbf R^2 is not derived from an inner product on \mathbf R^2.
Solution: Assume |x|=\max\{|x_1|,|x_2|\} is derived from some inner product I(x,y) on \mathbf R^2, so that I(x,x)=|x|^2. The polarization identity then gives

\displaystyle{I(a,b)=\dfrac{|a+b|^2-|a-b|^2}{4},\quad\forall a,b\in\mathbf R^2}

Let a=c=(2,1) and b=(0,-4); then

\displaystyle{I(a,b)=I(c,b)=\frac{9-25}{4}=-4,\quad I(a+c,b)=\frac{16-36}{4}=-5}

But the property (2) of inner product says I(a,b)+I(c,b)=I(a+c,b), a contradiction.
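The counterexample can be checked mechanically: if the sup norm came from an inner product, the polarization form would have to be additive in its first argument (a sketch; the name I_form is my own):

```python
def sup_norm(v):
    """Sup norm on R^2."""
    return max(abs(v[0]), abs(v[1]))

def I_form(u, w):
    """The polarization form (|u+w|^2 - |u-w|^2)/4 built from the sup norm."""
    s = (u[0] + w[0], u[1] + w[1])
    d = (u[0] - w[0], u[1] - w[1])
    return (sup_norm(s) ** 2 - sup_norm(d) ** 2) / 4

a = c = (2, 1)
b = (0, -4)
ac = (a[0] + c[0], a[1] + c[1])
assert I_form(a, b) == -4.0
assert I_form(ac, b) == -5.0
# Additivity fails: I(a,b) + I(c,b) = -8, but I(a+c,b) = -5.
assert I_form(a, b) + I_form(c, b) != I_form(ac, b)
```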

Exercise 4. (a) If \mathbf x=(x_1,x_2) and \mathbf y=(y_1,y_2), show that the function

\displaystyle{\langle x,y\rangle=\begin{bmatrix}x_1&x_2\end{bmatrix}\begin{bmatrix}2&-1\\-1&1\end{bmatrix}\begin{bmatrix}y_1\\y_2\end{bmatrix}}

is an inner product on \mathbf R^2.
(b) Show that the function

\displaystyle{\langle x,y\rangle=\begin{bmatrix}x_1&x_2\end{bmatrix}\begin{bmatrix}a&b\\b&c\end{bmatrix}\begin{bmatrix}y_1\\y_2\end{bmatrix}}

is an inner product on \mathbf R^2 if and only if b^2-ac<0 and a>0.

Solution:
( a ) By matrix multiplication we have

\displaystyle{\langle x,y\rangle=2x_1y_1-x_2y_1-x_1y_2+x_2y_2}

Symmetry and linearity in each slot follow directly from this formula; for positivity, note \langle x,x\rangle=2x_1^2-2x_1x_2+x_2^2=x_1^2+(x_1-x_2)^2>0 whenever x\neq 0.
( b ) By matrix multiplication we have

\displaystyle{\langle x,y\rangle=ax_1y_1+b(x_2y_1+x_1y_2)+cx_2y_2}

Among the four conditions of inner products, (1) and (3) are satisfied whatever the values of a,b,c. For (2) we also have this result, since

\displaystyle{\begin{aligned}\langle x+y,z\rangle&=a(x_1+y_1)z_1+b(x_2z_1+y_2z_1+x_1z_2+y_1z_2)+c(x_2+y_2)z_2\\&=[ax_1z_1+b(x_2z_1+x_1z_2)+cx_2z_2]+[ay_1z_1+b(y_2z_1+y_1z_2)+cy_2z_2]\\&=\langle x,z\rangle+\langle y,z\rangle\end{aligned}}

For the last condition, completing the square (assuming a\neq 0),

\displaystyle{\langle x,x\rangle=ax_1^2+2bx_1x_2+cx_2^2=a\left(x_1+\frac{b}{a}x_2\right)^2+\left(c-\frac{b^2}{a}\right)x_2^2}

This expression is positive for all (x_1,x_2)\neq(0,0) if and only if a>0 and c-b^2/a>0, that is, a>0 and b^2-ac<0. (If a=0, then \langle x,x\rangle=2bx_1x_2+cx_2^2 vanishes at x=(1,0), so positivity fails.)
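The criterion can be probed numerically by evaluating the quadratic form \langle x,x\rangle on a grid of nonzero vectors (a sketch with sample coefficient values of my own choosing):

```python
def Q(a, b, c, x1, x2):
    """The quadratic form <x,x> = a*x1^2 + 2b*x1*x2 + c*x2^2."""
    return a * x1 ** 2 + 2 * b * x1 * x2 + c * x2 ** 2

# (a, b, c) = (2, -1, 1) is the matrix from part (a): a > 0 and b^2 - ac = -1 < 0,
# so Q is positive at every nonzero grid point.
assert all(Q(2, -1, 1, x1, x2) > 0
           for x1 in range(-5, 6) for x2 in range(-5, 6)
           if (x1, x2) != (0, 0))

# (a, b, c) = (1, 2, 1) fails the criterion (b^2 - ac = 3 > 0),
# and indeed Q is negative at (1, -1).
assert Q(1, 2, 1, 1, -1) < 0
```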

Exercise 5. Let V be a vector space; let \{\mathrm a_{\alpha}\} be a set of vectors of V, as \alpha ranges over some index set J (which may be infinite). We say that the set \{\mathrm a_{\alpha}\} spans V if every vector \mathbf x in V can be written as a finite linear combination

\displaystyle{\mathbf x=c_{\alpha_1}{\mathrm a}_{\alpha_1}+\cdots+c_{\alpha_k}{\mathrm a}_{\alpha_k}}

of vectors from this set. The set \{\mathrm a_{\alpha}\} is independent if the scalars are uniquely determined by \mathbf x. The set \{\mathrm a_{\alpha}\} is a basis for V if it both spans V and is independent.
( a ) Check that the set \mathbf R^{\omega} of all “infinite-tuples” of real numbers \mathbf x=(x_1,x_2,\dots) is a vector space under component-wise addition and scalar multiplication.
( b ) Let \mathbf R^{\infty} denote the subset of \mathbf R^{\omega} consisting of all \mathbf x=(x_1,x_2,\dots) such that x_i=0 for all but finitely many values of i. Show that \mathbf R^{\infty} is a subspace of \mathbf R^{\omega}; find a basis for \mathbf R^{\infty}.
( c ) Let \mathcal F be the set of all real-valued functions f:[a,b]\to\mathbf R. Show that \mathcal F is a vector space if addition and scalar multiplication are defined in the natural way:

\displaystyle{(f+g)(x)=f(x)+g(x) \\ (cf)(x)=cf(x)}

( d ) Let \mathcal F_B be the subset of \mathcal F consisting of all bounded functions. Let \mathcal F_I consist of all integrable functions. Let \mathcal F_C consist of all continuous functions. Let \mathcal F_D consist of all continuously differentiable functions. Let \mathcal F_P consist of all polynomial functions. Show that each of these is a subspace of the preceding one, and find a basis for \mathcal F_P.
( e ) Show that the integral operator and the differentiation operator,

\displaystyle{(If)(x)=\int_a^xf(t)dt\quad\text{and}\quad(Df)(x)=f'(x),}

are linear transformations. What are possible domains and ranges of these transformations, among those listed in (d)?

Solution:
( a ) For \mathbf{x},\mathbf{y}\in\mathbf R^{\omega}, we have

\displaystyle{c\mathbf{x}+\mathbf{y}=(cx_1+y_1,cx_2+y_2,\dots),c\in\mathbf R}

and the result follows as \mathbf R is closed under addition and scalar multiplication.
( b ) That \mathbf R^{\infty} is a subspace of \mathbf R^{\omega} is easy to verify. A basis for \mathbf R^{\infty} is \{\mathrm e_i\}, where \mathrm e_i denotes the vector with 1 in the ith component and 0 in all others.
( c ) For f,g\in\mathcal F, we have (cf+g)(x)=cf(x)+g(x)\in\mathbf R, thus cf+g\in\mathcal F.
( d ) The fact that each of these is a subspace of the preceding one can be easily verified. A basis for \mathcal F_P is \{f_i\}_{i\geq 0}, where f_i(x)=x^i.
( e ) As

I(cf+g)(x)=\int_a^x[cf(t)+g(t)]dt=c\int_a^xf(t)dt+\int_a^xg(t)dt=c(If)(x)+(Ig)(x) \\D(cf+g)(x)=(cf+g)'(x)=cf'(x)+g'(x)=c(Df)(x)+(Dg)(x)

both operators are linear transformations.
Next we denote \mathscr D to be the domain of an operator and \mathscr R the range of an operator.
In \mathcal F_P, \mathscr D_I=\mathcal F_P and \mathscr R_I is the set of polynomial functions vanishing at a. \mathscr D_D=\mathscr R_D=\mathcal F_P.
In \mathcal F_D, \mathscr D_I=\mathscr R_I=\mathcal F_D, while \mathscr D_D=\mathcal F_D and \mathscr R_D=\mathcal F_C.
In \mathcal F_C, \mathscr D_I=\mathscr R_I=\mathcal F_C, while \mathscr D_D=\mathcal F_D and \mathscr R_D=\mathcal F_C.
In \mathcal F_I, \mathscr D_I=\mathcal F_I, \mathscr R_I=\mathcal F_C, while \mathscr D_D=\mathcal F_D and \mathscr R_D=\mathcal F_C.
In \mathcal F_B, \mathscr D_I=\mathcal F_I, \mathscr R_I=\mathcal F_C, while \mathscr D_D=\mathcal F_D and \mathscr R_D=\mathcal F_C.

Linear Algebra (2ed) Hoffman & Kunze: Afterword

During my university years, advanced algebra and analytic geometry was the course I did relatively well in, compared with analysis, but that very fact caused me great difficulty when relearning the subject from other textbooks. The reason: the Advanced Algebra textbook by Wang Efang of Peking University that we used was a standard Soviet-style text whose first chapter is determinants, computed via permutations and cofactors. I remember being baffled for the entire first month of the term, only coming around when we reached linear dependence and independence. So I had no trouble with linear dependence and vector spaces, but I remained confined to the level of solving systems of equations: I was fluent in matrix computation, yet entirely unable to grasp the relations among matrices, vectors, and spaces, and the more abstract theorems later on, such as the spectral theorem, I had never encountered at all. I recently heard that Tsinghua's undergraduate program has switched to an English-language linear algebra textbook, which I wholeheartedly endorse.

Toward the end of my studies I tried reading Curtis's Linear Algebra: An Introductory Approach. My understanding of linear algebra improved somewhat, but, busy with graduation, I never finished it.

In 2014 I had the chance to read Sheldon Axler's Linear Algebra Done Right, a slim book that is quite well known in China. My first reading was genuinely eye-opening, but I struggled once I reached eigenvalues and characteristic polynomials, and the later chapters really belong to what Chinese curricula call modern algebra (except for the final chapter on determinants; Axler has always been proud of deferring determinants to the end, as you can see on his website, though this treatment is controversial, since learning linear algebra while avoiding determinants is essentially impossible). In 2018 I read Linear Algebra Done Right again systematically and basically sorted out the threads of the subject, investing considerable effort. My impression afterward was that the book builds up linear algebra abstractly, but I still did not see why some of the harder notions are introduced, for example the adjoint operator.

Hoffman's book is an MIT textbook, and MIT's linear algebra open course is renowned. Although the book dates from the 1970s, it has long been regarded as a classic; Peking University's elective course on modern algebra also uses some material from Chapter 6 onward as its text. I bought the print edition early (in 2011), and I hear new copies can no longer be bought, so starting last year I began reading it in spare moments; on and off, it took roughly a year. Overall, both its strengths and weaknesses are pronounced. Its strength is the high vantage point, very helpful for later study of functional analysis; the content is solid and complete, covering everything that should be covered and some things that need not be (for example dual spaces and the double dual, which I have not seen in any other book). Its weaknesses stem from its age: the notation is dated and takes getting used to, and it omits more recent developments (for example the singular value decomposition, which Axler's book mentions). Finally, from Chapter 8 on it feels somewhat disorganized: some concepts, such as positive operators, are defined in Chapter 9 but already appear in the Chapter 8 exercises, and the last three chapters contain quite a few typos and flawed exercises. I therefore skipped the final two chapters.

The first eight chapters are still of high quality; once accustomed to the notation, the exposition has a bold, sweeping feel. But the book is absolutely unsuitable for a first course in linear algebra; a beginner would likely be completely lost. It suits upper-level undergraduates and graduate students. After finishing this book, a natural follow-up is Roman's Advanced Linear Algebra; those inclined toward theoretical study can read Artin's Algebra.