## Section 3.2 Matrices and linear systems

*Note: 1.5 lectures, first part of §5.1 in [EP], §7.2 and §7.3 in [BD], see also the linear algebra appendix*

### Subsection 3.2.1 Matrices and vectors

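Before we start talking about linear systems of ODEs, we need to talk about matrices, so let us review these briefly. A *matrix* is an \(m \times n\) array of numbers (\(m\) rows and \(n\) columns). For example, we denote a \(3 \times 5\) matrix as follows

\begin{equation*}
A =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35}
\end{bmatrix} .
\end{equation*}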

The numbers \(a_{ij}\) are called *elements* or *entries*.

By a *vector* we usually mean a *column vector*, that is an \(m \times 1\) matrix. If we mean a *row vector*, we will explicitly say so (a row vector is a \(1 \times n\) matrix). We usually denote matrices by upper case letters and vectors by lower case letters with an arrow such as \(\vec{x}\) or \(\vec{b}\text{.}\) By \(\vec{0}\) we mean the vector of all zeros.

We define some operations on matrices. We want \(1 \times 1\) matrices to really act like numbers, so our operations have to be compatible with this viewpoint.

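First, we can multiply a matrix by a *scalar* (a number). We simply multiply each entry in the matrix by the scalar. For example,

\begin{equation*}
2 \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
=
\begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{bmatrix} .
\end{equation*}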

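Matrix addition is also easy. We add matrices element by element. For example,

\begin{equation*}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}
+
\begin{bmatrix} 1 & 1 \\ 0 & 2 \\ 4 & -1 \end{bmatrix}
=
\begin{bmatrix} 1+1 & 2+1 \\ 3+0 & 4+2 \\ 5+4 & 6-1 \end{bmatrix}
=
\begin{bmatrix} 2 & 3 \\ 3 & 6 \\ 9 & 5 \end{bmatrix} .
\end{equation*}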

If the sizes do not match, then addition is not defined.

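If we denote by 0 the matrix with all zero entries, by \(c\text{,}\) \(d\) scalars, and by \(A\text{,}\) \(B\text{,}\) \(C\) matrices, we have the following familiar rules:

\begin{align*}
A + 0 & = A = 0 + A , \\
A + B & = B + A , \\
(A + B) + C & = A + (B + C) , \\
c(A + B) & = cA + cB , \\
(c + d)A & = cA + dA .
\end{align*}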

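Another useful operation for matrices is the so-called *transpose*. This operation just swaps rows and columns of a matrix. The transpose of \(A\) is denoted by \(A^T\text{.}\) Example:

\begin{equation*}
{\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}}^T
=
\begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} .
\end{equation*}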
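The entrywise operations of this subsection are easy to try on a computer. The following is a minimal Python sketch, assuming a matrix is stored as a list of rows. The function names `scalar_multiply`, `add`, and `transpose` are made up for this illustration and do not come from any library.

```python
# A matrix is represented as a list of rows (an assumption of this sketch).

def scalar_multiply(c, A):
    """Multiply every entry of the matrix A by the scalar c."""
    return [[c * entry for entry in row] for row in A]

def add(A, B):
    """Add two matrices entry by entry; the sizes must match."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrix sizes do not match, so A + B is not defined")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    """Swap the rows and columns of A."""
    return [list(column) for column in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0, -1],
     [2, 2, 2]]

print(scalar_multiply(2, A))   # [[2, 4, 6], [8, 10, 12]]
print(add(A, B))               # [[2, 2, 2], [6, 7, 8]]
print(transpose(A))            # [[1, 4], [2, 5], [3, 6]]
```

Calling `add` on matrices of different sizes raises an error in this sketch, mirroring the fact that such a sum is not defined.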

### Subsection 3.2.2 Matrix multiplication

Let us now define matrix multiplication. First we define the so-called *dot product* (or *inner product*) of two vectors. Usually this will be a row vector multiplied with a column vector of the same size. For the dot product we multiply each pair of entries from the first and the second vector and we sum these products. The result is a single number. For example,

\begin{equation*}
\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}
\cdot
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= a_1 b_1 + a_2 b_2 + a_3 b_3 .
\end{equation*}

And similarly for larger (or smaller) vectors.

Armed with the dot product we define the *product of matrices*. First let us denote by \(\operatorname{row}_i(A)\) the \(i^{\text{th}}\) row of \(A\) and by \(\operatorname{column}_j(A)\) the \(j^{\text{th}}\) column of \(A\text{.}\) For an \(m \times n\) matrix \(A\) and an \(n \times p\) matrix \(B\) we can define the product \(AB\text{.}\) We let \(AB\) be an \(m \times p\) matrix whose \(ij^{\text{th}}\) entry is the dot product

\begin{equation*}
\operatorname{row}_i(A) \cdot \operatorname{column}_j(B) .
\end{equation*}

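Do note how the sizes match up: \(m \times n\) multiplied by \(n \times p\) is \(m \times p\text{.}\) Example:

\begin{equation*}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix}
=
\begin{bmatrix}
1\cdot 1 + 2\cdot 1 + 3 \cdot 1 & 1\cdot 0 + 2\cdot 1 + 3 \cdot 0 & 1\cdot (-1) + 2\cdot 1 + 3 \cdot 0 \\
4\cdot 1 + 5\cdot 1 + 6 \cdot 1 & 4\cdot 0 + 5\cdot 1 + 6 \cdot 0 & 4\cdot (-1) + 5\cdot 1 + 6 \cdot 0
\end{bmatrix}
=
\begin{bmatrix} 6 & 2 & 1 \\ 15 & 5 & 1 \end{bmatrix} .
\end{equation*}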
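The definition translates almost word for word into code. Here is a small Python sketch, assuming matrices are lists of rows; the names `dot` and `matmul` are chosen for this illustration only.

```python
def dot(u, v):
    """Dot product of two vectors of the same size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same size")
    return sum(a * b for a, b in zip(u, v))

def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows)."""
    if len(A[0]) != len(B):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    columns_of_B = list(zip(*B))
    # The ij-th entry is the dot product of row i of A with column j of B.
    return [[dot(row, column) for column in columns_of_B] for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
B = [[1, 0, -1],
     [1, 1, 1],
     [1, 0, 0]]          # 3 x 3

print(matmul(A, B))      # [[6, 2, 1], [15, 5, 1]], a 2 x 3 matrix
```

The size check in `matmul` is exactly the requirement that the inner dimensions agree.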

For multiplication we want an analogue of a 1. This analogue is the so-called *identity matrix*. The identity matrix is a square matrix with 1s on the diagonal and zeros everywhere else. It is usually denoted by \(I\text{.}\) For each size we have a different identity matrix and so sometimes we may denote the size as a subscript. For example, \(I_3\) would be the \(3 \times 3\) identity matrix

\begin{equation*}
I = I_3 =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} .
\end{equation*}

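We have the following rules for matrix multiplication. Suppose that \(A\text{,}\) \(B\text{,}\) \(C\) are matrices of the correct sizes so that the following make sense. Let \(\alpha\) denote a scalar (number).

\begin{align*}
A(BC) & = (AB)C , \\
A(B+C) & = AB + AC , \\
(B+C)A & = BA + CA , \\
\alpha(AB) & = (\alpha A)B = A(\alpha B) , \\
IA & = A = AI .
\end{align*}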

A few warnings are in order.

1. \(AB \not= BA\) in general (it may be true by fluke sometimes). That is, matrices do not commute. For example, take \(A = \left[ \begin{smallmatrix} 1 & 1 \\ 1 & 1 \end{smallmatrix} \right]\) and \(B = \left[ \begin{smallmatrix} 1 & 0 \\ 0 & 2 \end{smallmatrix} \right]\text{.}\)

2. \(AB = AC\) does not necessarily imply \(B=C\text{,}\) even if \(A\) is not 0.

3. \(AB = 0\) does not necessarily mean that \(A=0\) or \(B=0\text{.}\) Try, for example, \(A = B = \left[ \begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix} \right]\text{.}\)
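The two concrete examples in these warnings can be checked quickly by machine. The snippet below is an illustrative Python sketch; `matmul` is a helper written here for the purpose and assumes matrices are lists of rows.

```python
def matmul(A, B):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# The matrices from the first warning: AB and BA differ.
A = [[1, 1],
     [1, 1]]
B = [[1, 0],
     [0, 2]]
print(matmul(A, B))   # [[1, 2], [1, 2]]
print(matmul(B, A))   # [[1, 1], [2, 2]]

# The matrix from the last warning: a nonzero matrix whose square is zero.
N = [[0, 1],
     [0, 0]]
print(matmul(N, N))   # [[0, 0], [0, 0]]
```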

For the last two items to hold we would need to “divide” by a matrix. This is where the *matrix inverse* comes in. Suppose that \(A\) and \(B\) are \(n \times n\) matrices such that

\begin{equation*}
AB = I = BA .
\end{equation*}

Then we call \(B\) the inverse of \(A\) and we denote \(B\) by \(A^{-1}\text{.}\) If the inverse of \(A\) exists, then we call \(A\) *invertible*. If \(A\) is not invertible, we sometimes say \(A\) is *singular*.

If \(A\) is invertible, then \(AB = AC\) does imply that \(B = C\) (in particular the inverse of \(A\) is unique). We just multiply both sides by \(A^{-1}\) (on the left) to get \(A^{-1}AB = A^{-1}AC\) or \(IB=IC\) or \(B=C\text{.}\) It is also not hard to see that \({(A^{-1})}^{-1} = A\text{.}\)

### Subsection 3.2.3 The determinant

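For square matrices we define a useful quantity called the *determinant*. We define the determinant of a \(1 \times 1\) matrix as the value of its only entry. For a \(2 \times 2\) matrix we define

\begin{equation*}
\det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) = ad - bc .
\end{equation*}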

Before trying to define the determinant for larger matrices, let us note the meaning of the determinant. Consider an \(n \times n\) matrix as a mapping of the \(n\)-dimensional Euclidean space \({\mathbb{R}}^n\) to itself, where \(\vec{x}\) gets sent to \(A \vec{x}\text{.}\) In particular, a \(2 \times 2\) matrix \(A\) is a mapping of the plane to itself. The determinant of \(A\) is the factor by which the area of objects changes. If we take the unit square (square of side 1) in the plane, then \(A\) takes the square to a parallelogram of area \(\lvert\det(A)\rvert\text{.}\) The sign of \(\det(A)\) denotes changing of orientation (negative if the axes get flipped). For example, let

\begin{equation*}
A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} .
\end{equation*}

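Then \(\det(A) = 1+1 = 2\text{.}\) Let us see where the (unit) square with vertices \((0,0)\text{,}\) \((1,0)\text{,}\) \((0,1)\text{,}\) and \((1,1)\) gets sent. Clearly \((0,0)\) gets sent to \((0,0)\text{.}\) For the other three vertices,

\begin{equation*}
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} ,
\qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} ,
\qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} .
\end{equation*}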

The image of the square is another square with vertices \((0,0)\text{,}\) \((1,-1)\text{,}\) \((1,1)\text{,}\) and \((2,0)\text{.}\) The image square has a side of length \(\sqrt{2}\) and is therefore of area 2.

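If you think back to high school geometry, you may have seen a formula for computing the area of a parallelogram with vertices \((0,0)\text{,}\) \((a,c)\text{,}\) \((b,d)\) and \((a+b,c+d)\text{.}\) And it is precisely

\begin{equation*}
\left\lvert \, \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \, \right\rvert .
\end{equation*}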

The vertical lines above mean absolute value. The matrix \(\left[ \begin{smallmatrix} a & b \\ c & d \end{smallmatrix} \right]\) carries the unit square to the given parallelogram.
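The area interpretation can be tested numerically. The sketch below is only an illustration: it assumes the matrix \(A\) with rows \([1, 1]\) and \([-1, 1]\) (consistent with the image vertices listed above), and it computes polygon areas with the shoelace formula, a standard technique that is not part of this text. The names `apply` and `signed_area` are made up for the example.

```python
def apply(A, v):
    """Send the point v to A v, where A is a list of rows."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def signed_area(vertices):
    """Signed area of a polygon (shoelace formula); positive if counterclockwise."""
    total = 0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

A = [[1, 1],
     [-1, 1]]
# Unit square, vertices listed counterclockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [apply(A, v) for v in square]

print(image)                # [(0, 0), (1, -1), (2, 0), (1, 1)]
print(signed_area(square))  # 1.0
print(signed_area(image))   # 2.0, which equals det(A) = 1*1 - 1*(-1)
```

Using the signed area rather than its absolute value also shows the orientation: a matrix that flips the axes, such as one that swaps the two coordinates, gives a negative result.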

Let us look at the determinant for larger matrices. We define \(A_{ij}\) as the matrix \(A\) with the \(i^{\text{th}}\) row and the \(j^{\text{th}}\) column deleted. To compute the determinant of a matrix, pick one row, say the \(i^{\text{th}}\) row and compute:

\begin{equation*}
\det (A) = \sum_{j=1}^n {(-1)}^{i+j} a_{ij} \det (A_{ij}) .
\end{equation*}

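For the first row we get

\begin{equation*}
\det (A) = a_{11} \det (A_{11}) - a_{12} \det (A_{12}) + a_{13} \det (A_{13}) - \cdots
\begin{cases}
+ a_{1n} \det (A_{1n}) & \text{if } n \text{ is odd,} \\
- a_{1n} \det (A_{1n}) & \text{if } n \text{ is even.}
\end{cases}
\end{equation*}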

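We alternately add and subtract the determinants of the submatrices \(A_{ij}\) multiplied by \(a_{ij}\) for a fixed \(i\) and all \(j\text{.}\) For a \(3 \times 3\) matrix, picking the first row, we get \(\det (A) = a_{11} \det (A_{11}) - a_{12} \det (A_{12}) + a_{13} \det (A_{13})\text{.}\) For example,

\begin{align*}
\det \left( \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \right)
& =
1 \cdot \det \left( \begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix} \right)
- 2 \cdot \det \left( \begin{bmatrix} 4 & 6 \\ 7 & 9 \end{bmatrix} \right)
+ 3 \cdot \det \left( \begin{bmatrix} 4 & 5 \\ 7 & 8 \end{bmatrix} \right) \\
& =
1 (5 \cdot 9 - 6 \cdot 8) - 2 (4 \cdot 9 - 6 \cdot 7) + 3 (4 \cdot 8 - 5 \cdot 7)
= -3 + 12 - 9 = 0 .
\end{align*}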

The numbers \({(-1)}^{i+j}\det(A_{ij})\) are called *cofactors* of the matrix and this way of computing the determinant is called the *cofactor expansion*. No matter which row you pick, you always get the same number. It is also possible to compute the determinant by expanding along columns (picking a column instead of a row above). It is true that \(\det(A) = \det(A^T)\text{.}\)
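Cofactor expansion is naturally recursive, which makes it short to write down in code. The following Python sketch expands along the first row; it assumes matrices are lists of rows, and the names `minor` and `det` are chosen here for illustration. The number of multiplications grows roughly like \(n!\text{,}\) so this is only sensible for small matrices.

```python
def minor(A, i, j):
    """Return A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant of a square matrix by cofactor expansion along the first row."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("the determinant is only defined for square matrices")
    if n == 1:
        return A[0][0]
    # With 0-based j, the sign of the first-row cofactor is (-1)**j.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

def transpose(A):
    return [list(column) for column in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
B = [[1, 1],
     [-1, 1]]

print(det(A))                        # 0
print(det(B))                        # 2
print(det(B) == det(transpose(B)))   # True
```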

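A common notation for the determinant is a pair of vertical lines:

\begin{equation*}
\begin{vmatrix} a & b \\ c & d \end{vmatrix}
=
\det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) .
\end{equation*}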

I personally find this notation confusing as vertical lines usually mean a positive quantity, while determinants can be negative. Also think about how to write the absolute value of a determinant. I will not use this notation in this book.

Think of the determinant as telling you the scaling factor of a mapping. If \(B\) doubles the sizes of geometric objects and \(A\) triples them, then \(AB\) (which applies \(B\) to an object and then \(A\)) should make size go up by a factor of \(6\text{.}\) This is true in general:

\begin{equation*}
\det(AB) = \det(A)\det(B) .
\end{equation*}
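As a quick sanity check of the product rule \(\det(AB) = \det(A)\det(B)\text{,}\) here is a small Python sketch for the \(2 \times 2\) case. The two matrices are arbitrary choices made for this illustration, and `matmul` and `det2` are helper names invented here.

```python
def matmul(A, B):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det2(M):
    """Determinant of a 2 x 2 matrix: ad - bc."""
    (a, b), (c, d) = M
    return a * d - b * c

A = [[1, 1],
     [-1, 1]]    # determinant 2
B = [[2, 1],
     [0, 3]]     # determinant 6

print(det2(A), det2(B))        # 2 6
print(det2(matmul(A, B)))      # 12
```

A single numerical check is of course not a proof, only a way to see the rule in action.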

This property is one of the most useful, and it is employed often to actually compute determinants. A particularly interesting consequence is to note what it means for existence of inverses. Take \(A\) and \(B\) to be inverses of each other, that is \(AB=I\text{.}\) Then

\begin{equation*}
\det(A)\det(B) = \det(AB) = \det(I) = 1 .
\end{equation*}

Neither \(\det(A)\) nor \(\det(B)\) can be zero. Let us state this as a theorem as it will be very important in the context of this course.

###### Theorem 3.2.1.

An \(n \times n\) matrix \(A\) is invertible if and only if \(\det (A) \not= 0\text{.}\)

In fact, \(\det(A^{-1}) \det(A) = 1\) says that \(\det(A^{-1}) = \frac{1}{\det(A)}\text{.}\) So we even know what the determinant of \(A^{-1}\) is before we know how to compute \(A^{-1}\text{.}\)

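There is a simple formula for the inverse of a \(2 \times 2\) matrix

\begin{equation*}
{\begin{bmatrix} a & b \\ c & d \end{bmatrix}}^{-1}
=
\frac{1}{ad-bc}
\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} .
\end{equation*}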

Notice the determinant of the matrix \([\begin{smallmatrix}a&b\\c&d\end{smallmatrix}]\) in the denominator of the fraction. The formula only works if the determinant is nonzero, otherwise we are dividing by zero.
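The \(2 \times 2\) inverse formula, namely \(\frac{1}{ad-bc}\) times the matrix with rows \([d, -b]\) and \([-c, a]\text{,}\) can be sketched in Python as follows. This is an illustration under stated assumptions: the entries are integers or `Fraction` values so that the arithmetic is exact, and `inverse_2x2` is a name made up for the example.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix with integer or Fraction entries."""
    (a, b), (c, d) = M
    determinant = a * d - b * c
    if determinant == 0:
        # The formula would divide by zero: the matrix is singular.
        raise ValueError("matrix is not invertible (determinant is zero)")
    factor = Fraction(1, determinant)
    return [[d * factor, -b * factor],
            [-c * factor, a * factor]]

A = [[1, 1],
     [-1, 1]]
A_inv = inverse_2x2(A)

expected = [[Fraction(1, 2), Fraction(-1, 2)],
            [Fraction(1, 2), Fraction(1, 2)]]
print(A_inv == expected)   # True
```

Exact fractions are used instead of floating point so that products such as \(A A^{-1}\) come out as exactly the identity.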

### Subsection 3.2.4 Solving linear systems

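One application of matrices we will need is to solve systems of linear equations. This is best shown by example. Suppose that we have the following system of linear equations

\begin{align*}
2 x_1 + 2 x_2 + 2 x_3 & = 2 , \\
x_1 + x_2 + 3 x_3 & = 5 , \\
x_1 + 4 x_2 + x_3 & = 10 .
\end{align*}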

Without changing the solution, we could swap equations in this system, we could multiply any of the equations by a nonzero number, and we could add a multiple of one equation to another equation. It turns out these operations always suffice to find a solution if one exists.

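It is easier to write the system as a matrix equation. The system above can be written as

\begin{equation*}
\begin{bmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix} .
\end{equation*}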

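To solve the system we put the coefficient matrix (the matrix on the left-hand side of the equation) together with the vector on the right-hand side and get the so-called *augmented matrix*

\begin{equation*}
\left[
\begin{array}{ccc|c}
2 & 2 & 2 & 2 \\
1 & 1 & 3 & 5 \\
1 & 4 & 1 & 10
\end{array}
\right] .
\end{equation*}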

We apply the following three elementary operations.

1. Swap two rows.

2. Multiply a row by a nonzero number.

3. Add a multiple of one row to another row.

We keep doing these operations until we get into a state where it is easy to read off the answer, or until we get into a contradiction indicating no solution, for example if we come up with an equation such as \(0=1\text{.}\)

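Let us work through the example. First multiply the first row by \(\nicefrac{1}{2}\) to obtain

\begin{equation*}
\left[
\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
1 & 1 & 3 & 5 \\
1 & 4 & 1 & 10
\end{array}
\right] .
\end{equation*}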

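Now subtract the first row from the second and third row.

\begin{equation*}
\left[
\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & 0 & 2 & 4 \\
0 & 3 & 0 & 9
\end{array}
\right]
\end{equation*}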

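Multiply the last row by \(\nicefrac{1}{3}\) and the second row by \(\nicefrac{1}{2}\text{.}\)

\begin{equation*}
\left[
\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & 0 & 1 & 2 \\
0 & 1 & 0 & 3
\end{array}
\right]
\end{equation*}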

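Swap rows 2 and 3.

\begin{equation*}
\left[
\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & 1 & 0 & 3 \\
0 & 0 & 1 & 2
\end{array}
\right]
\end{equation*}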

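Subtract the last row from the first, then subtract the second row from the first.

\begin{equation*}
\left[
\begin{array}{ccc|c}
1 & 0 & 0 & -4 \\
0 & 1 & 0 & 3 \\
0 & 0 & 1 & 2
\end{array}
\right]
\end{equation*}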

If we think about what equations this augmented matrix represents, we see that \(x_1 = -4\text{,}\) \(x_2 = 3\text{,}\) and \(x_3 = 2\text{.}\) We try this solution in the original system and, voilà, it works!
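The sequence of row operations in this example can be replayed step by step in code. The following Python sketch uses exact fractions and three small helper functions, one per elementary operation. The helper names (`swap`, `scale`, `add_multiple`) are invented for this illustration, and rows are indexed from 0 in the code, so "row 1" in the text is index 0.

```python
from fractions import Fraction

def swap(M, i, j):
    """Swap rows i and j."""
    M[i], M[j] = M[j], M[i]

def scale(M, i, c):
    """Multiply row i by the nonzero number c."""
    if c == 0:
        raise ValueError("a row may only be multiplied by a nonzero number")
    M[i] = [c * x for x in M[i]]

def add_multiple(M, src, dst, c):
    """Add c times row src to row dst."""
    M[dst] = [y + c * x for x, y in zip(M[src], M[dst])]

# Augmented matrix of the system, with the right-hand side as the last column.
M = [[Fraction(x) for x in row] for row in
     [[2, 2, 2, 2],
      [1, 1, 3, 5],
      [1, 4, 1, 10]]]

scale(M, 0, Fraction(1, 2))    # multiply the first row by 1/2
add_multiple(M, 0, 1, -1)      # subtract the first row from the second
add_multiple(M, 0, 2, -1)      # subtract the first row from the third
scale(M, 2, Fraction(1, 3))    # multiply the last row by 1/3
scale(M, 1, Fraction(1, 2))    # multiply the second row by 1/2
swap(M, 1, 2)                  # swap rows 2 and 3
add_multiple(M, 2, 0, -1)      # subtract the last row from the first
add_multiple(M, 1, 0, -1)      # subtract the second row from the first

solution = [row[-1] for row in M]
print(solution == [-4, 3, 2])  # True
```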

###### Exercise 3.2.1.

Check that the solution above really solves the given equations.

We write this equation in matrix notation as

\begin{equation*}
A \vec{x} = \vec{b} ,
\end{equation*}

where \(A\) is the matrix \(\left[ \begin{smallmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{smallmatrix} \right]\) and \(\vec{b}\) is the vector \(\left[ \begin{smallmatrix} 2 \\ 5 \\ 10 \end{smallmatrix} \right]\text{.}\) The solution can also be computed via the inverse,

\begin{equation*}
\vec{x} = A^{-1} A \vec{x} = A^{-1} \vec{b} .
\end{equation*}

It is possible that the solution is not unique, or that no solution exists. It is easy to tell if a solution does not exist. If during the row reduction you come up with a row where all the entries except the last one are zero (the last entry in a row corresponds to the right-hand side of the equation), then the system is *inconsistent* and has no solution. For example, for a system of 3 equations and 3 unknowns, if you find a row such as \([\,0 \quad 0 \quad 0 ~\,|\,~ 1\,]\) in the augmented matrix, you know the system is inconsistent. That row corresponds to \(0=1\text{.}\)

You generally try to use row operations until the following conditions are satisfied. The first (from the left) nonzero entry in each row is called the *leading entry*.

1. The leading entry in any row is strictly to the right of the leading entry of the row above.

2. Any zero rows are below all the nonzero rows.

3. All leading entries are 1.

4. All the entries above and below a leading entry are zero.

Such a matrix is said to be in *reduced row echelon form*. The variables corresponding to columns with no leading entries are said to be *free variables*. Free variables mean that we can pick those variables to be anything we want and then solve for the rest of the unknowns.
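Reaching reduced row echelon form can be automated with the three elementary row operations. Below is a compact Python sketch of one common way to organize the computation (often called Gauss-Jordan elimination). It uses exact fractions; the function name `rref` and the two small test matrices at the end are choices made for this illustration, not something prescribed by the text.

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M (a list of rows) with Fraction entries."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivot_row = 0
    for col in range(cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        candidates = [r for r in range(pivot_row, rows) if R[r][col] != 0]
        if not candidates:
            continue  # no leading entry in this column
        r = candidates[0]
        R[pivot_row], R[r] = R[r], R[pivot_row]           # swap two rows
        lead = R[pivot_row][col]
        R[pivot_row] = [x / lead for x in R[pivot_row]]   # make the leading entry 1
        for other in range(rows):
            if other != pivot_row and R[other][col] != 0:
                factor = R[other][col]
                # Subtract a multiple of the pivot row to clear this column.
                R[other] = [y - factor * x for x, y in zip(R[pivot_row], R[other])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return R

# The worked example: a unique solution can be read off the last column.
print(rref([[2, 2, 2, 2], [1, 1, 3, 5], [1, 4, 1, 10]])
      == [[1, 0, 0, -4], [0, 1, 0, 3], [0, 0, 1, 2]])        # True

# A system with a free variable: the second column has no leading entry.
print(rref([[1, 2, 1, 4], [2, 4, 3, 9]])
      == [[1, 2, 0, 3], [0, 0, 1, 1]])                       # True
```

When `rref` is applied to an augmented matrix, the last column is treated like any other column, so an inconsistent system shows up as a row whose only nonzero entry is a 1 in the last position.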

###### Example 3.2.1.

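The following augmented matrix is in reduced row echelon form.

\begin{equation*}
\left[
\begin{array}{ccc|c}
1 & 2 & 0 & 3 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0
\end{array}
\right]
\end{equation*}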

Suppose the variables are \(x_1\text{,}\) \(x_2\text{,}\) and \(x_3\text{.}\) Then \(x_2\) is the free variable, \(x_1 = 3 - 2x_2\text{,}\) and \(x_3 = 1\text{.}\)

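On the other hand if during the row reduction process you come up with a matrix such as

\begin{equation*}
\left[
\begin{array}{ccc|c}
1 & 2 & 13 & 3 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 3
\end{array}
\right] ,
\end{equation*}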

there is no need to go further. The last row corresponds to the equation \(0 x_1 + 0 x_2 + 0 x_3 = 3\text{,}\) which is preposterous. Hence, no solution exists.
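Reading the outcome off a reduced augmented matrix is mechanical, so it can be sketched in a few lines of Python. The function names `leading_columns` and `describe` and the sample matrices are assumptions made for this illustration; columns are indexed from 0, so index 1 stands for \(x_2\text{.}\)

```python
def leading_columns(R):
    """Column index of the leading (first nonzero) entry of each nonzero row."""
    leads = []
    for row in R:
        nonzero = [j for j, x in enumerate(row) if x != 0]
        if nonzero:
            leads.append(nonzero[0])
    return leads

def describe(R):
    """Classify a system from its augmented matrix R in row echelon form.

    Returns a description and the list of free-variable column indices.
    """
    n = len(R[0]) - 1                 # number of unknowns
    leads = leading_columns(R)
    if n in leads:                    # a row of the form [0 ... 0 | nonzero]
        return "inconsistent", []
    free = [j for j in range(n) if j not in leads]
    if free:
        return "infinitely many solutions", free
    return "unique solution", []

print(describe([[1, 2, 0, 3], [0, 0, 1, 1], [0, 0, 0, 0]]))
# ('infinitely many solutions', [1])
print(describe([[1, 0, 2, 1], [0, 1, 1, 2], [0, 0, 0, 3]]))
# ('inconsistent', [])
print(describe([[1, 0, 0, -4], [0, 1, 0, 3], [0, 0, 1, 2]]))
# ('unique solution', [])
```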

### Subsection 3.2.5 Computing the inverse

If the matrix \(A\) is square and there exists a unique solution \(\vec{x}\) to \(A \vec{x} = \vec{b}\) for any \(\vec{b}\) (there are no free variables), then \(A\) is invertible. Multiplying both sides by \(A^{-1}\text{,}\) you can see that \(\vec{x} = A^{-1} \vec{b}\text{.}\) So it is useful to compute the inverse if you want to solve the equation for many different right-hand sides \(\vec{b}\text{.}\)

We have a formula for the \(2 \times 2\) inverse, but it is also not hard to compute inverses of larger matrices. While we will not have too much occasion to compute inverses for larger matrices than \(2 \times 2\) by hand, let us touch on how to do it. Finding the inverse of \(A\) is actually just solving a bunch of linear equations. If we can solve \(A \vec{x}_k = \vec{e}_k\) where \(\vec{e}_k\) is the vector with all zeros except a 1 at the \(k^{\text{th}}\) position, then the inverse is the matrix with the columns \(\vec{x}_k\) for \(k=1,2,\ldots,n\) (exercise: why?). Therefore, to find the inverse we write a larger \(n \times 2n\) augmented matrix \([ \,A ~|~ I\, ]\text{,}\) where \(I\) is the identity matrix. We then perform row reduction. The reduced row echelon form of \([ \,A ~|~ I\, ]\) will be of the form \([ \,I ~|~ A^{-1}\, ]\) if and only if \(A\) is invertible. We then just read off the inverse \(A^{-1}\text{.}\)
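The \([ \,A ~|~ I\, ]\) procedure can be sketched in Python as follows. This is a minimal illustration, not a robust numerical routine: it uses exact fractions, the helper `rref` is one possible way to carry out the row reduction, and the names `rref` and `inverse` are made up for the example. The test matrix is the coefficient matrix of the system solved earlier in this section.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form of M (a list of rows), computed with Fractions."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivot_row = 0
    for col in range(cols):
        candidates = [r for r in range(pivot_row, rows) if R[r][col] != 0]
        if not candidates:
            continue
        r = candidates[0]
        R[pivot_row], R[r] = R[r], R[pivot_row]
        lead = R[pivot_row][col]
        R[pivot_row] = [x / lead for x in R[pivot_row]]
        for other in range(rows):
            if other != pivot_row and R[other][col] != 0:
                factor = R[other][col]
                R[other] = [y - factor * x for x, y in zip(R[pivot_row], R[other])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return R

def inverse(A):
    """Invert a square matrix by row reducing the n x 2n matrix [A | I]."""
    n = len(A)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    augmented = [list(row) + identity[i] for i, row in enumerate(A)]
    R = rref(augmented)
    if [row[:n] for row in R] != identity:
        raise ValueError("matrix is not invertible")
    return [row[n:] for row in R]    # the right half is the inverse

A = [[2, 2, 2],
     [1, 1, 3],
     [1, 4, 1]]
b = [2, 5, 10]

A_inv = inverse(A)
x = [sum(entry * value for entry, value in zip(row, b)) for row in A_inv]
print(x == [-4, 3, 2])   # True
```

Once `A_inv` is known, solving for a new right-hand side only costs one matrix-vector product, which is the point made at the start of this subsection.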

### Subsection 3.2.6 Exercises

###### Exercise 3.2.2.

Solve \(\left[ \begin{smallmatrix} 1 & 2 \\ 3 & 4 \end{smallmatrix} \right] \vec{x} = \left[ \begin{smallmatrix} 5 \\ 6 \end{smallmatrix} \right]\) by using the matrix inverse.

###### Exercise 3.2.3.

Compute the determinant of \(\left[ \begin{smallmatrix} 9 & -2 & -6 \\ -8 & 3 & 6 \\ 10 & -2 & -6 \end{smallmatrix} \right]\text{.}\)

###### Exercise 3.2.4.

Compute the determinant of \(\left[ \begin{smallmatrix} 1 & 2 & 3 & 1 \\ 4 & 0 & 5 & 0 \\ 6 & 0 & 7 & 0 \\ 8 & 0 & 10 & 1 \end{smallmatrix} \right]\text{.}\) Hint: Expand along the proper row or column to make the calculations simpler.

###### Exercise 3.2.5.

Compute the inverse of \(\left[ \begin{smallmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{smallmatrix} \right]\text{.}\)

###### Exercise 3.2.6.

For which \(h\) is \(\left[ \begin{smallmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & h \end{smallmatrix} \right]\) not invertible? Is there only one such \(h\text{?}\) Are there several? Infinitely many?

###### Exercise 3.2.7.

For which \(h\) is \(\left[ \begin{smallmatrix} h & 1 & 1 \\ 0 & h & 0 \\ 1 & 1 & h \end{smallmatrix} \right]\) not invertible? Find all such \(h\text{.}\)

###### Exercise 3.2.8.

Solve \(\left[ \begin{smallmatrix} 9 & -2 & -6 \\ -8 & 3 & 6 \\ 10 & -2 & -6 \end{smallmatrix} \right] \vec{x} = \left[ \begin{smallmatrix} 1 \\ 2 \\ 3 \end{smallmatrix} \right]\text{.}\)

###### Exercise 3.2.9.

Solve \(\left[ \begin{smallmatrix} 5 & 3 & 7 \\ 8 & 4 & 4 \\ 6 & 3 & 3 \end{smallmatrix} \right] \vec{x} = \left[ \begin{smallmatrix} 2 \\ 0 \\ 0 \end{smallmatrix} \right]\text{.}\)

###### Exercise 3.2.10.

Solve \(\left[ \begin{smallmatrix} 3 & 2 & 3 & 0 \\ 3 & 3 & 3 & 3 \\ 0 & 2 & 4 & 2 \\ 2 & 3 & 4 & 3 \end{smallmatrix} \right] \vec{x} = \left[ \begin{smallmatrix} 2 \\ 0 \\ 4 \\ 1 \end{smallmatrix} \right]\text{.}\)

###### Exercise 3.2.11.

Find 3 nonzero \(2 \times 2\) matrices \(A\text{,}\) \(B\text{,}\) and \(C\) such that \(AB = AC\) but \(B \not= C\text{.}\)

###### Exercise 3.2.101.

Compute the determinant of \(\left[ \begin{smallmatrix} 1 & 1 & 1 \\ 2 & 3 & -5 \\ 1 & -1 & 0 \end{smallmatrix}\right]\text{.}\)

Answer: \(-15\)

###### Exercise 3.2.102.

Find \(t\) such that \(\left[ \begin{smallmatrix} 1 & t \\ -1 & 2 \end{smallmatrix}\right]\) is not invertible.

Answer: \(-2\)

###### Exercise 3.2.103.

Solve \(\left[ \begin{smallmatrix} 1 & 1 \\ 1 & -1 \end{smallmatrix}\right] \vec{x} = \left[ \begin{smallmatrix} 10 \\ 20 \end{smallmatrix}\right]\text{.}\)

Answer: \(\vec{x} = \left[ \begin{smallmatrix} 15 \\ -5 \end{smallmatrix}\right]\)

###### Exercise 3.2.104.

Suppose \(a, b, c\) are nonzero numbers. Let \(M=\left[ \begin{smallmatrix} a & 0 \\ 0 & b \end{smallmatrix}\right]\text{,}\) \(N=\left[ \begin{smallmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{smallmatrix}\right]\text{.}\)

a) Compute \(M^{-1}\text{.}\)

b) Compute \(N^{-1}\text{.}\)

Answer: a) \(\left[ \begin{smallmatrix} \nicefrac{1}{a} & 0 \\ 0 & \nicefrac{1}{b} \end{smallmatrix}\right]\) b) \(\left[ \begin{smallmatrix} \nicefrac{1}{a} & 0 & 0 \\ 0 & \nicefrac{1}{b} & 0 \\ 0 & 0 & \nicefrac{1}{c} \end{smallmatrix}\right]\)