Section 8.2 Analysis with vector spaces
Note: 3 lectures
Subsection 8.2.1 Norms
Let us start measuring the size of vectors and hence distance.
Definition 8.2.1.
If \(X\) is a vector space, then we say a function \(\snorm{\cdot} \colon X \to \R\) is a norm if
\(\snorm{x} \geq 0\text{,}\) with \(\snorm{x}=0\) if and only if \(x=0\text{.}\)
\(\snorm{cx} = \sabs{c} \, \snorm{x}\) for all \(c \in \R\) and \(x \in X\text{.}\)
\(\snorm{x+y} \leq \snorm{x}+\snorm{y}\) for all \(x,y \in X\text{.}\) (triangle inequality)
A vector space equipped with a norm is called a normed vector space.
Given a norm (any norm) on a vector space \(X\text{,}\) we define a distance \(d(x,y) := \snorm{xy}\text{,}\) and this \(d\) makes \(X\) into a metric space (exercise). So everything you know about metric spaces applies to normed vector spaces.
Before defining the standard norm on \(\R^n\text{,}\) let us define the standard scalar dot product on \(\R^n\text{.}\) For vectors \(x=(x_1,x_2,\ldots,x_n) \in \R^n\) and \(y=(y_1,y_2,\ldots,y_n) \in \R^n\) define
The dot product is linear in each variable separately, or in more fancy language it is bilinear. That is, if \(y\) is fixed, the map \(x \mapsto x \cdot y\) is a linear map from \(\R^n\) to \(\R\text{.}\) Similarly, if \(x\) is fixed, then \(y \mapsto x \cdot y\) is linear. It is also symmetric in the sense that \(x \cdot y = y \cdot x\text{.}\) The euclidean norm is defined as
We normally just use \(\snorm{x}\text{,}\) only in the rare instance when it is necessary to emphasize that we are talking about the euclidean norm will we use \(\snorm{x}_{\R^n}\text{.}\) It is easy to see that the euclidean norm satisfies i and ii. To prove that iii holds, the key inequality is the socalled Cauchy–Schwarz inequality we saw before. As this inequality is so important, we restate and reprove a slightly stronger version using the notation of this chapter.
Theorem 8.2.2. Cauchy–Schwarz inequality.
Let \(x, y \in \R^n\text{,}\) then
with equality if and only if \(x = \lambda y\) or \(y = \lambda x\) for some \(\lambda \in \R\text{.}\)
Proof.
If \(x=0\) or \(y = 0\text{,}\) then the theorem holds trivially. So assume \(x\not= 0\) and \(y \not= 0\text{.}\)
If \(x\) is a scalar multiple of \(y\text{,}\) that is \(x = \lambda y\) for some \(\lambda \in \R\text{,}\) then the theorem holds with equality:
Fixing \(x\) and \(y\text{,}\) as a function of \(t\text{,}\) \(\snorm{x+ty}^2\) is a quadratic polynomial:
If \(x\) is not a scalar multiple of \(y\text{,}\) then \(\snorm{x+ty}^2 > 0\) for all \(t\text{.}\) So the polynomial \(\snorm{x+ty}^2\) is never zero. Elementary algebra says that the discriminant must be negative:
or in other words, \({(x \cdot y)}^2 < \snorm{x}^2\snorm{y}^2\text{.}\)
Item iii, the triangle inequality in \(\R^n\text{,}\) follows from:
The distance \(d(x,y) := \snorm{xy}\) is the standard distance (standard metric) on \(\R^n\) that we used when we talked about metric spaces.
Definition 8.2.3.
Let \(A \in L(X,Y)\text{.}\) Define
The number \(\snorm{A}\) (possibly \(\infty\)) is called the operator norm. We will see below that it is indeed a norm for finitedimensional spaces. Again, when necessary to emphasize which norm we are talking about, we may write it as \(\snorm{A}_{L(X,Y)}\text{.}\)
For example, if \(X=\R^1\) with norm \(\snorm{x}=\sabs{x}\text{,}\) we think of elements of \(L(X)\) as multiplication by scalars: \(x \mapsto ax\text{.}\) If \(\snorm{x}=\sabs{x}=1\text{,}\) then \(\sabs{a x} = \sabs{a}\text{,}\) so the operator norm of \(a\) is \(\sabs{a}\text{.}\)
By linearity, \(\norm{A \frac{x}{\snorm{x}}} = \frac{\snorm{Ax}}{\snorm{x}}\) for all nonzero \(x \in X\text{.}\) The vector \(\frac{x}{\snorm{x}}\) is of norm 1. Therefore,
This implies, assuming \(\snorm{A}\) is not infinity, that for every \(x \in X\text{,}\)
We will use this inequality a lot. It is not hard to see from the definition that \(\snorm{A} = 0\) if and only if \(A = 0\text{,}\) where by \(A=0\) we mean that \(A\) takes every vector to the zero vector. It is also not difficult to compute the operator norm of the identity operator:
The operator norm is not always a norm on \(L(X,Y)\text{,}\) in particular, \(\snorm{A}\) is not always finite for \(A \in L(X,Y)\text{.}\) We prove below that \(\snorm{A}\) is finite when \(X\) is finitedimensional. The operator norm being finite is equivalent to \(A\) being continuous. For infinitedimensional spaces, neither statement needs to be true. For an example, consider the vector space of continuously differentiable functions on \([0,2\pi]\) using the uniform norm. The functions \(t \mapsto \sin(n t)\) have norm 1, but their derivatives have norm \(n\text{.}\) So differentiation, which is a linear operator valued in the space of continuous functions, has infinite operator norm on this space. We will stick to finitedimensional spaces.
When we talk about a finitedimensional vector space \(X\text{,}\) we often think of \(\R^n\text{,}\) although if we have a norm on \(X\text{,}\) the norm might not be the standard euclidean norm. In the exercises, you can prove that every norm is “equivalent” to the euclidean norm in that the topology it generates is the same. For simplicity, we only prove the following proposition for the euclidean space, and the proof for a general finitedimensional space is left as an exercise.
Proposition 8.2.4.
Let \(X\) and \(Y\) be normed vector spaces and \(A \in L(X,Y)\text{.}\) Suppose that \(X\) is finitedimensional. Then \(\snorm{A} < \infty\text{,}\) and \(A\) is uniformly continuous (Lipschitz with constant \(\snorm{A}\)).
Proof.
As we said we only prove the proposition for euclidean space, so suppose that \(X = \R^n\) and the norm is the standard euclidean norm. The general case is left as an exercise.
Let \(\{ e_1,e_2,\ldots,e_n \}\) be the standard basis of \(\R^n\text{.}\) Write \(x \in \R^n\text{,}\) with \(\snorm{x} = 1\text{,}\) as
Since \(e_j \cdot e_\ell = 0\) whenever \(j\not=\ell\) and \(e_j \cdot e_j = 1\text{,}\) then \(c_j = x \cdot e_j\) and by Cauchy–Schwarz
Then
The righthand side does not depend on \(x\text{.}\) We found a finite upper bound for \(\snorm{Ax}\) independent of \(x\text{,}\) so \(\snorm{A} < \infty\text{.}\)
For any normed vector spaces \(X\) and \(Y\text{,}\) and \(A \in L(X,Y)\text{,}\) suppose that \(\snorm{A} < \infty\text{.}\) For \(v,w \in X\text{,}\)
As \(\snorm{A} < \infty\text{,}\) then this says \(A\) is Lipschitz with constant \(\snorm{A}\text{.}\)
Proposition 8.2.5.
Let \(X\text{,}\) \(Y\text{,}\) and \(Z\) be finitedimensional normed vector spaces^{ 1 }.

If \(A,B \in L(X,Y)\) and \(c \in \R\text{,}\) then
\begin{equation*} \snorm{A+B} \leq \snorm{A}+\snorm{B}, \qquad \snorm{cA} = \sabs{c} \, \snorm{A} . \end{equation*}In particular, the operator norm is a norm on the vector space \(L(X,Y)\text{.}\)

If \(A \in L(X,Y)\) and \(B \in L(Y,Z)\text{,}\) then
\begin{equation*} \snorm{BA} \leq \snorm{B} \, \snorm{A} . \end{equation*}
Proof.
First, since all the spaces are finitedimensional, then all the operator norms are finite, and the statements make sense to begin with.
For i,
So \(\snorm{A+B} \leq \snorm{A}+\snorm{B}\text{.}\)
Similarly,
Thus \(\snorm{cA} \leq \sabs{c} \, \snorm{A}\text{.}\) Next,
Hence \(\sabs{c} \, \snorm{A} \leq \snorm{cA}\text{.}\)
For ii write
As a norm defines a metric, there is a metric space topology on \(L(X,Y)\) for finitedimensional vector spaces, so we can talk about open/closed sets, continuity, and convergence.
Proposition 8.2.6.
Let \(X\) be a finitedimensional normed vector space. Let \(GL(X) \subset L(X)\) be the set of invertible linear operators.^{ 2 }

If \(A \in GL(X)\text{,}\) \(B \in L(X)\text{,}\) and
\begin{equation} \snorm{AB} < \frac{1}{\snorm{A^{1}}},\tag{8.2} \end{equation}then \(B\) is invertible.
\(GL(X)\) is an open subset, and \(A \mapsto A^{1}\) is a continuous function on \(GL(X)\text{.}\)
Let us make sense of this proposition on a simple example. Consider \(X=\R^1\text{,}\) where linear operators are just numbers \(a\) and the operator norm of \(a\) is \(\sabs{a}\text{.}\) The operator \(a\) is invertible (\(a^{1} = \nicefrac{1}{a}\)) whenever \(a \not=0\text{.}\) The condition \(\sabs{ab} < \frac{1}{\sabs{a^{1}}}\) does indeed imply that \(b\) is not zero. And \(a \mapsto \nicefrac{1}{a}\) is a continuous map. When the dimension is bigger than \(1\text{,}\) there are other noninvertible operators than just zero, and in general things are a bit more difficult.
Proof.
Let us prove i. We know something about \(A^{1}\) and \(AB\text{;}\) they are linear operators. So apply them to a vector:
Therefore,
Assume \(x \neq 0\) and so \(\snorm{x} \neq 0\text{.}\) Using (8.2), we obtain
In other words, \(\snorm{Bx} \not= 0\) for all nonzero \(x\text{,}\) and hence \(Bx \not= 0\) for all nonzero \(x\text{.}\) This is enough to see that \(B\) is onetoone (if \(Bx = By\text{,}\) then \(B(xy) = 0\text{,}\) so \(x=y\)). As \(B\) is onetoone operator from \(X\) to \(X\text{,}\) which is finitedimensional and hence also onto by Proposition 8.1.18, \(B\) is invertible.
Let us prove ii. Item i immediately implies that \(GL(X)\) is open. Let us show that the inverse is continuous. Fix some \(A \in GL(X)\text{.}\) Let \(B\) be near \(A\text{,}\) specifically \(\snorm{AB} < \frac{1}{2 \snorm{A^{1}}}\text{.}\) Then (8.2) is satisfied and \(B\) is invertible. A similar computation as above (using \(B^{1}y\) instead of \(x\)) gives
or
So \(\snorm{B^{1}} \leq 2 \snorm{A^{1}} \text{.}\)
Now
and
Therefore, as \(B\) tends to \(A\text{,}\) \(\snorm{B^{1}A^{1}}\) tends to 0, and so the inverse operation is a continuous function at \(A\text{.}\)
Subsection 8.2.2 Matrices
Once we fix a basis in a finitedimensional vector space \(X\text{,}\) we can represent a vector of \(X\) as an \(n\)tuple of numbers—a vector in \(\R^n\text{.}\) Same can be done with \(L(X,Y)\text{,}\) bringing us to matrices, which are a convenient way to represent finitedimensional linear transformations. Suppose \(\{ x_1, x_2, \ldots, x_n \}\) and \(\{ y_1, y_2, \ldots, y_m \}\) are bases for vector spaces \(X\) and \(Y\) respectively. A linear operator is determined by its values on the basis. Given \(A \in L(X,Y)\text{,}\) \(A x_j\) is an element of \(Y\text{.}\) Define the numbers \(a_{i,j}\) as follows
and write them as a matrix
We sometimes write \(A\) as \([a_{i,j}]\text{.}\) We say \(A\) is an \(m\)by\(n\) matrix. The \(j\)th column of the matrix gives precisely the coefficients that represent \(A x_j\) in terms of the basis \(\{ y_1,y_2,\ldots,y_m \}\text{.}\) If we know the numbers \(a_{i,j}\text{,}\) then via the formula (8.3) we find the corresponding linear operator, as it is determined by the action on a basis. Hence, once we fix a basis on \(X\) and on \(Y\text{,}\) we have a onetoone correspondence between \(L(X,Y)\) and the \(m\)by\(n\) matrices.
When
then
which gives rise to the familiar rule for matrix multiplication.
More generally, if \(B\) is an \(n\)by\(r\) matrix with entries \(b_{j,k}\text{,}\) then the matrix for \(C = AB\) is an \(m\)by\(r\) matrix whose \((i,k)\)th entry \(c_{i,k}\) is
A way to remember it is if you order the indices as we do, that is row, column, and put the elements in the same order as the matrices, then it is the “middle index” that is “summedout.”
There is a onetoone correspondence between matrices and linear operators in \(L(X,Y)\text{,}\) once we fix a basis in \(X\) and in \(Y\text{.}\) If we choose a different basis, we get different matrices. This is an important distinction, the operator \(A\) acts on elements of \(X\text{,}\) the matrix is something that works with \(n\)tuples of numbers, that is, vectors of \(\R^n\text{.}\) By convention, we use standard bases in \(\R^n\) unless otherwise specified, and we identify \(L(\R^n,\R^m)\) with the set of \(m\)by\(n\) matrices.
A linear mapping changing one basis to another is represented by a square matrix in which the columns represent vectors of the second basis in terms of the first basis. We call such a linear mapping a change of basis. So for two choices of a basis in an \(n\)dimensional vector space, there is a linear mapping (a change of basis) taking one basis to the other, and this corresponds to an \(n\)by\(n\) matrix which does the corresponding operation on \(\R^n\text{.}\)
Suppose \(X=\R^n\text{,}\) \(Y=\R^m\text{,}\) and all the bases are just the standard bases. Using the Cauchy–Schwarz inequality compute
In other words, we have a bound on the operator norm (note that equality rarely happens)
The right hand side is the euclidean norm on \(\R^{nm}\text{,}\) the space of all the entries of the matrix. If the entries go to zero, then \(\snorm{A}\) goes to zero. Conversely,
So if the operator norm of \(A\) goes to zero, so do the entries. In particular, if \(A\) is fixed and \(B\) is changing, then the entries of \(B\) go to the entries of \(A\) if and only if \(B\) goes to \(A\) in operator norm (\(\snorm{AB}\) goes to zero). We have proved:
Proposition 8.2.7.
The topology (the set of open sets) on \(L(\R^n,\R^m)\) is the same whether we consider \(L(\R^n,\R^m)\) as a metric space using the operator norm, or with the euclidean metric of \(\R^{nm}\text{.}\)
In particular, let \(S\) be a metric space and let \(\pi \colon L(\R^n,\R^m) \to \R^{nm}\) identify an operator with the \(nm\)tuple of entries of the corresponding matrix. Then \(f \colon S \to L(\R^n,\R^m)\) is continuous if and only if \(\pi \circ f \colon S \to \R^{nm}\) is continuous. Similarly for \(g \colon L(\R^n,\R^m) \to S\) and \(g \circ \pi^{1} \colon \R^{nm} \to S\text{.}\)
Subsection 8.2.3 Determinants
A certain number can be assigned to square matrices that measures how the corresponding linear mapping stretches space. In particular, this number, called the determinant, can be used to test for invertibility of a matrix.
Define the symbol \(\operatorname{sgn}(x)\) (read “sign of \(x\)”) for a number \(x\) by
Suppose \(\sigma = (\sigma_1,\sigma_2,\ldots,\sigma_n)\) is a permutation of the integers \((1,2,\ldots,n)\text{,}\) that is, a reordering of \((1,2,\ldots,n)\text{.}\) Define
Here \(\prod\) stands for multiplication, similarly to how \(\sum\) stands for summation.
Any permutation can be obtained by a sequence of transpositions (switchings of two elements). A permutation is even (resp. odd) if it takes an even (resp. odd) number of transpositions to get from \((1,2,\ldots,n)\) to \(\sigma\text{.}\) For instance, \((2,4,3,1)\) is two transpositions away from \((1,2,3,4)\) and is therefore even: \((1,2,3,4) \to (2,1,3,4) \to (2,4,3,1)\text{.}\) Being even or odd is welldefined: \(\operatorname{sgn}(\sigma)\) is \(1\) if \(\sigma\) is even and \(1\) if \(\sigma\) is odd (exercise). This fact can be proved by noting that applying a transposition changes the sign, and computing that \(\operatorname{sgn}(1,2,\ldots,n) = 1\text{.}\)
Let \(S_n\) be the set of all permutations on \(n\) elements (the symmetric group). Let \(A= [a_{i,j}]\) be a square \(n\)by\(n\) matrix. Define the determinant of \(A\) as
Proposition 8.2.8.
\(\det(I) = 1\text{.}\)
For every \(j=1,2,\ldots,n\text{,}\) the function \(x_j \mapsto \det\bigl([x_1 ~~~ x_2 ~~~ \cdots ~~~ x_n ]\bigr)\) is linear.
If two columns of a matrix are interchanged, then the determinant changes sign.
If two columns of \(A\) are equal, then \(\det(A) = 0\text{.}\)
If a column is zero, then \(\det(A) = 0\text{.}\)
\(A \mapsto \det(A)\) is a continuous function on \(L(\R^n)\text{.}\)
\(\det\left( \left[\begin{smallmatrix} a & b \\ c &d \end{smallmatrix}\right] \right) = adbc\text{,}\) and \(\det \bigl( [a] \bigr) = a\text{.}\)
In fact, the determinant is the unique function that satisfies i, ii, and iii. But we digress. By ii, we mean that if we fix all the vectors \(x_1,\ldots,x_n\) except for \(x_j\text{,}\) and let \(v,w \in \R^n\) be two vectors, and \(a,b \in \R\) be scalars, then
Proof.
We go through the proof quickly, as you have likely seen it before. Item i is trivial. For ii, notice that each term in the definition of the determinant contains exactly one factor from each column. Item iii follows by noting that switching two columns is like switching the two corresponding numbers in every element in \(S_n\text{.}\) Hence, all the signs are changed. Item iv follows because if two columns are equal, and we switch them, we get the same matrix back. So item iii says the determinant must be 0. Item v follows because the product in each term in the definition includes one element from the zero column. Item vi follows as \(\det\) is a polynomial in the entries of the matrix and hence continuous (as a function of the entries of the matrix). Two matrices are A function defined on matrices is continuous in the operator norm if and only if it is continuous as a function of the entries (Proposition 8.2.7). Finally, item vii is a direct computation.
The determinant tells us about areas and volumes, and how they change. For example, in the \(1\)by\(1\) case, a matrix is just a number, and the determinant is exactly this number. It says how the linear mapping “stretches” the space. Similarly for \(\R^2\text{.}\) Suppose \(A \in L(\R^2)\) is a linear transformation. It can be checked directly that the area of the image of the unit square \(A\bigl([0,1]^2\bigr)\) is precisely \(\sabs{\det(A)}\text{.}\) This works with arbitrary figures, not just the unit square: The absolute value of the determinant tells us the stretch in the area. The sign of the determinant tells us if the image is flipped (changes orientation) or not. In \(\R^3\) it tells us about the 3dimensional volume, and in \(n\) dimensions about the \(n\)dimensional volume. We claim this without proof.
Proposition 8.2.9.
If \(A\) and \(B\) are \(n\)by\(n\) matrices, then \(\det(AB) = \det(A)\det(B)\text{.}\) Furthermore, \(A\) is invertible if and only if \(\det(A) \not= 0\) and in this case, \(\det(A^{1}) = \frac{1}{\det(A)}\text{.}\)
Proof.
Let \(b_1,b_2,\ldots,b_n\) be the columns of \(B\text{.}\) Then
That is, the columns of \(AB\) are \(Ab_1,Ab_2,\ldots,Ab_n\text{.}\)
Let \(b_{j,k}\) denote the elements of \(B\) and \(a_j\) the columns of \(A\text{.}\) By linearity of the determinant,
In the last equality, we can sum over just the elements of \(S_n\) instead of all \(n\)tuples for integers between 1 and \(n\) by noting that when two columns in the determinant are the same, then the determinant is zero. Then we reordered the columns to the original ordering to obtain the sgn.
The conclusion that \(\det(AB) = \det(A)\det(B)\) follows by recognizing that the expression in parentheses above is the determinant of \(B\text{.}\) We obtain this by plugging in \(A=I\text{.}\) The expression we get for the determinant of \(B\) has rows and columns swapped, so as a side note, we have also just proved that the determinant of a matrix and its transpose are equal.
Let us prove the “Furthermore” part. If \(A\) is invertible, then \(A^{1}A = I\) and consequently \(\det(A^{1})\det(A) = \det(A^{1}A) = \det(I) = 1\text{.}\) If \(A\) is not invertible, then there must be a nonzero vector that \(A\) takes to zero as \(A\) is not onetoone. In other words, the columns of \(A\) are linearly dependent. Suppose
where not all \(\gamma_j\) are equal to 0. Without loss of generality suppose \(\gamma_1\neq 0\text{.}\) Take
Using the definition of the determinant (there is only a single permutation \(\sigma\) for which \(\prod_{i=1}^n b_{i,\sigma_i}\) is nonzero) we find \(\det(B) = \gamma_1 \not= 0\text{.}\) Then \(\det(AB) = \det(A)\det(B) = \gamma_1\det(A)\text{.}\) The first column of \(AB\) is zero, and hence \(\det(AB) = 0\text{.}\) We conclude \(\det(A) = 0\text{.}\)
Proposition 8.2.10.
Determinant is independent of the basis. In other words, if \(B\) is invertible, then
The proof is to compute \(\det(B^{1}AB) = \det(B^{1})\det(A)\det(B) = \frac{1}{\det(B)}\det(A)\det(B) = \det(A)\text{.}\)
If in one basis \(A\) is the matrix representing a linear operator, then for another basis we can find a matrix \(B\) such that the matrix \(B^{1}AB\) takes us to the first basis, applies \(A\) in the first basis, and takes us back to the basis we started with. Let \(X\) be a finitedimensional vector space. Let \(\Phi \in L(X,\R^n)\) take a basis \(\{ x_1,\ldots,x_n \}\) to the standard basis \(\{ e_1,\ldots,e_n \}\) and let \(\Psi \in L(X,\R^n)\) take another basis \(\{ y_1, \ldots, y_n \}\) to the standard basis. Let \(T \in L(X)\) be a linear operator and let a matrix \(A\) represent the operator in the basis \(\{ x_1,\ldots,x_n \}\text{.}\) Then \(B\) would be such that we have the following diagram^{ 3 }: The two \(\R^n\)s on the bottom row represent \(X\) in the first basis, and the \(\R^n\)s on top represent \(X\) in the second basis.
If we compute the determinant of the matrix \(A\text{,}\) we obtain the same determinant if we use any other basis; in the other basis the matrix would be \(B^{1}AB\text{.}\) Consequently,
is a welldefined function without the need to fix a basis. That is, \(\det\) is defined on \(L(X)\text{,}\) not just on matrices.
There are three types of socalled elementary matrices. Let \(e_1,e_2,\ldots,e_n\) be the standard basis on \(\R^n\) as usual.
First, for \(j = 1,2,\ldots,n\) and \(\lambda \in \R\text{,}\) \(\lambda \neq 0\text{,}\) define the first type of an elementary matrix, an \(n\)by\(n\) matrix \(E\) by
Given any \(n\)by\(m\) matrix \(M\) the matrix \(EM\) is the same matrix as \(M\) except with the \(j\)th row multiplied by \(\lambda\text{.}\) It is an easy computation (exercise) that \(\det(E) = \lambda\text{.}\)
Next, for \(j\) and \(k\) with \(j\neq k\text{,}\) and \(\lambda \in \R\text{,}\) define the second type of an elementary matrix \(E\) by
Given any \(n\)by\(m\) matrix \(M\) the matrix \(EM\) is the same matrix as \(M\) except with \(\lambda\) times the \(k\)th row added to the \(j\)th row. It is an easy computation (exercise) that \(\det(E) = 1\text{.}\)
Finally, for \(j\) and \(k\) with \(j\neq k\text{,}\) define the third type of an elementary matrix \(E\) by
Given any \(n\)by\(m\) matrix \(M\) the matrix \(EM\) is the same matrix with \(j\)th and \(k\)th rows swapped. It is an easy computation (exercise) that \(\det(E) = 1\text{.}\)
Proposition 8.2.11.
Let \(T\) be an \(n\)by\(n\) invertible matrix. Then there exists a finite sequence of elementary matrices \(E_1, E_2, \ldots, E_k\) such that
and
The proof is left as an exercise. The proposition says that we can compute the determinant by doing elementary row operations. For computing the determinant, one does not have to factor the matrix into a product of elementary matrices completely. One only does row operations until one finds an upper triangular matrix, that is, a matrix \([a_{i,j}]\) where \(a_{i,j} = 0\) if \(i > j\text{.}\) Computing determinant of such a matrix is not difficult (exercise).
Factorization into elementary matrices (or variations on elementary matrices) is useful in proofs involving an arbitrary linear operator, by reducing to a proof for an elementary matrix, similarly as the computation of the determinant.
Subsection 8.2.4 Exercises
Exercise 8.2.1.
For a vector space \(X\) with a norm \(\snorm{\cdot}\text{,}\) show that \(d(x,y) := \snorm{xy}\) makes \(X\) a metric space.
Exercise 8.2.2.
(Easy) Show that for square matrices \(A\) and \(B\text{,}\) \(\det(AB) = \det(BA)\text{.}\)
Exercise 8.2.3.
For \(x \in \R^n\text{,}\) define
sometimes called the sup or the max norm.
Show that \(\snorm{\cdot}_\infty\) is a norm on \(\R^n\) (defining a different distance).
What is the unit ball \(B(0,1)\) in this norm?
Exercise 8.2.4.
For \(x \in \R^n\text{,}\) define
sometimes called the \(1\)norm (or \(L^1\) norm).
Show that \(\snorm{\cdot}_1\) is a norm on \(\R^n\) (defining a different distance, sometimes called the taxicab distance).
What is the unit ball \(B(0,1)\) in this norm? Think about what it is in \({\mathbb{R}}^2\) and \(\mathbb{R}^3\text{.}\) Hint: It is, for example, a convex hull of a finite number of points.
Exercise 8.2.5.
Using the euclidean norm on \(\R^2\text{,}\) compute the operator norm of the operators in \(L(\R^2)\) given by the matrices:
a) \(\left[ \begin{smallmatrix} 1 & 0 \\ 0 & 2 \end{smallmatrix} \right]\) b) \(\left[ \begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix} \right]\) c) \(\left[ \begin{smallmatrix} 1 & 1 \\ 0 & 1 \end{smallmatrix} \right]\) d) \(\left[ \begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix} \right]\)
Exercise 8.2.6.
Using the standard euclidean norm \(\R^n\text{,}\) show:
Suppose \(A \in L(\R,\R^n)\) is defined for \(x \in \R\) by \(Ax := xa\) for a vector \(a \in \R^n\text{.}\) Then the operator norm \(\snorm{A}_{L(\R,\R^n)} = \snorm{a}_{\R^n}\text{.}\) (That is, the operator norm of \(A\) is the euclidean norm of \(a\text{.}\))
Suppose \(B \in L(\R^n,\R)\) is defined for \(x \in \R^n\) by \(Bx := b \cdot x\) for a vector \(b \in \R^n\text{.}\) Then the operator norm \(\snorm{B}_{L(\R^n,\R)} = \snorm{b}_{\R^n}\text{.}\)
Exercise 8.2.7.
Suppose \(\sigma = (\sigma_1,\sigma_2,\ldots,\sigma_n)\) is a permutation of \((1,2,\ldots,n)\text{.}\)
Show that we can make a finite number of transpositions (switching of two elements) to get to \((1,2,\ldots,n)\text{.}\)
Using the definition (8.4) show that \(\sigma\) is even if \(\operatorname{sgn}(\sigma) = 1\) and \(\sigma\) is odd if \(\operatorname{sgn}(\sigma) = 1\text{.}\) In particular, showing that being odd or even is welldefined.
Exercise 8.2.8.
Verify the computation of the determinant for the three types of elementary matrices.
Exercise 8.2.9.
Prove Proposition 8.2.11.
Exercise 8.2.10.
Suppose \(D = [d_{i,j}]\) is an \(n\)by\(n\) diagonal matrix, that is, \(d_{i,j} = 0\) whenever \(i \not= j\text{.}\) Show that \(\det(D) = d_{1,1}d_{2,2} \cdots d_{n,n}\text{.}\)
Suppose \(A\) is a diagonalizable matrix. That is, there exists a matrix \(B\) such that \(B^{1}AB = D\) for a diagonal matrix \(D = [d_{i,j}]\text{.}\) Show that \(\det(A) = d_{1,1}d_{2,2} \cdots d_{n,n}\text{.}\)
Exercise 8.2.11.
Take the vector space of polynomials \(\R[t]\) and the linear operator \(D \in L(\R[t])\) that is the differentiation (we proved in an earlier exercise that \(D\) is a linear operator). Given \(P(t) = c_0 + c_1 t + \cdots + c_n t^n \in \R[t]\) define \(\snorm{P} := \sup \bigl\{ \sabs{c_j} : j = 0,1,2,\ldots,n \bigr\}\text{.}\)
Show that \(\snorm{P}\) is a norm on \(\R[t]\text{.}\)
Show that \(D\) does not have bounded operator norm, that is \(\snorm{D} = \infty\text{.}\) Hint: Consider the polynomials \(t^n\) as \(n\) tends to infinity.
Exercise 8.2.12.
We finish the proof of Proposition 8.2.4. Let \(X\) be a finitedimensional normed vector space with basis \(\{ x_1,x_2,\ldots,x_n \}\text{.}\)

Show that \(f \colon \R^n \to \R\text{,}\)
\begin{equation*} f(c_1,c_2,\ldots,c_n) := \snorm{c_1 x_1 + c_2 x_2 + \cdots + c_n x_n} , \end{equation*}is continuous (the norm is the one on \(X\)).
Show that there exist numbers \(m\) and \(M\) such that if \(c = (c_1,c_2,\ldots,c_n) \in \R^n\) with \(\snorm{c} = 1\) (standard euclidean norm), then \(m \leq \snorm{c_1 x_1 + c_2 x_2 + \cdots + c_n x_n} \leq M\) (here the norm is on \(X\)).
Show that there exists a number \(B\) such that if \(\snorm{c_1 x_1 + c_2 x_2 + \cdots + c_n x_n}=1\text{,}\) then \(\sabs{c_j} \leq B\text{.}\)
Use part c) to show that if \(X\) is finitedimensional vector spaces and \(A \in L(X,Y)\text{,}\) then \(\snorm{A} < \infty\text{.}\)
Exercise 8.2.13.
Let \(X\) be a finitedimensional vector space with norm \(\snorm{\cdot}\) and basis \(\{ x_1,x_2,\ldots,x_n \}\text{.}\) Let \(c = (c_1,c_2,\ldots,c_n) \in \R^n\) and \(\snorm{c}\) be the standard euclidean norm on \(\R^n\text{.}\)

Prove that there exist positive numbers \(m,M > 0\) such that for all \(c \in \R^n\text{,}\)
\begin{equation*} m \snorm{c} \leq \snorm{c_1 x_1 + c_2 x_2 + \cdots + c_n x_n} \leq M \snorm{c} . \end{equation*}Hint: See previous exercise.

Use part a) to show that of \(\snorm{\cdot}_1\) and \(\snorm{\cdot}_2\) are two norms on \(X\text{,}\) then there exist positive numbers \(m,M > 0\) (perhaps different from above) such that for all \(x \in X\text{,}\) we have
\begin{equation*} m \snorm{x}_1 \leq \snorm{x}_2 \leq M \snorm{x}_1 . \end{equation*} Show that \(U \subset X\) is open in the metric defined by \(\norm{xy}_1\) if and only if \(U\) is open in the metric defined by \(\norm{xy}_2\text{.}\) In particular, convergence of sequences and continuity of functions is the same in either norm.
Exercise 8.2.14.
Let \(A\) be an upper triangular matrix. Find a formula for the determinant of \(A\) in terms of the diagonal entries, and prove that your formula works.
Exercise 8.2.15.
Given an \(n\)by\(n\) matrix \(A\text{,}\) prove that \(\sabs{\det(A)} \leq \snorm{A}^n\) (the norm on \(A\) is the operator norm). Hint: One way to do it is to first prove it in the case \(\snorm{A}=1\text{,}\) which means that all columns are of norm 1 or less, then prove that this means that \(\sabs{\det(A)} \leq 1\) using linearity.
Exercise 8.2.16.
Consider Proposition 8.2.6 where \(X=\R^n\) (for all \(n\)) using the euclidean norm.
Prove that the estimate \(\snorm{AB} < \frac{1}{\snorm{A^{1}}}\) is the best possible: For every \(A \in GL(\R^n)\text{,}\) find a \(B\) where equality is satisfied and \(B\) is not invertible. Hint: Difficulty is that \(\snorm{A}\snorm{A^{1}}\) is not always \(1\text{.}\) Prove that a vector \(x_1\) can be completed to a basis \(\{ x_1,\ldots,x_n \}\) such that \(x_1 \cdot x_j = 0\) for \(j \geq 2\text{.}\) For the right \(x_1\text{,}\) make it so that \((AB)x_j = 0\) for \(j \geq 2\text{.}\)
For every fixed \(A \in GL(\R^n)\text{,}\) let \(\sM\) denote the set of matrices \(B\) such that \(\snorm{AB} < \frac{1}{\snorm{A^{1}}}\text{.}\) Prove that while every \(B \in \sM\) is invertible, \(\snorm{B^{1}}\) is unbounded as a function of \(B\) on \(\sM\text{.}\)
Let \(A\) be an \(n\)by\(n\) matrix. A number \(\lambda \in \C\) (possibly complex even for a real matrix) is called an eigenvalue of \(A\) if there is a nonzero (possibly complex) vector \(x \in \C^n\) such that \(Ax = \lambda x\) (the multiplication by complex vectors is the same as for real vectors. In particular, if \(x = a+ib\) for real vectors \(a\) and \(b\text{,}\) and \(A\) is a real matrix, then \(Ax = Aa + i Ab\)). The number
is called the spectral radius of \(A\text{.}\) Here \(\abs{\lambda}\) is the complex modulus. We state without proof that at least one eigenvalue always exists, and there are no more than \(n\) distinct eigenvalues of \(A\text{.}\) You can therefore assume that \(0 \leq \rho(A) < \infty\text{.}\) The exercises below hold for complex matrices, but feel free to assume they are real matrices.
Exercise 8.2.17.
Let \(A,S\) be \(n\)by\(n\) matrices, where \(S\) is invertible. Prove that \(\lambda\) is an eigenvalue of \(A\text{,}\) if and only if it is an eigenvalue of \(S^{1}AS\text{.}\) Then prove that \(\rho(S^{1}AS) = \rho(S)\text{.}\) In particular, \(\rho\) is a welldefined function on \(L(X)\) for every finitedimensional vector space \(X\text{.}\)
Exercise 8.2.18.
Let \(A\) be an \(n\)by\(n\) matrix \(A\text{.}\)
Prove \(\rho(A) \leq \snorm{A}\text{.}\)
For every \(k \in \N\text{,}\) prove \(\rho(A) \leq \snorm{A^k}^{1/k}\text{.}\)
Suppose \(\displaystyle \lim_{k\to\infty} A^k = 0\) (limit in the operator norm). Prove that \(\rho(A) < 1\text{.}\)
Exercise 8.2.19.
We say a set \(C \subset \R^n\) is symmetric if \(x \in C\) implies \(x \in C\text{.}\)
Let \(\snorm{\cdot}\) be any given norm on \(\R^n\text{.}\) Show that the closed unit ball \(C(0,1)\) (using the metric induced by this norm) is a compact symmetric convex set.

(Challenging) Let \(C \subset \R^n\) be a compact symmetric convex set and \(0 \in C\text{.}\) Show that
\begin{equation*} \snorm{x} := \inf \left\{ \lambda : \lambda > 0 \text{ and } \frac{x}{\lambda} \in C \right\} \end{equation*}is a norm on \(\R^n\text{,}\) and \(C = C(0,1)\) (the closed unit ball) in the metric induced by this norm.
Hint: Feel free to the result of Exercise 8.2.13 part c).