Note: 1 and a half lectures, first part of §5.1 in [EP], §7.2 and §7.3 in [BD]
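Before we can start talking about linear systems of ODEs, we will need to talk about matrices, so let us review these briefly. A matrix is an $m \times n$ array of numbers ($m$ rows and $n$ columns). For example, we denote a $3 \times 5$ matrix as follows
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \end{bmatrix} .$$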
By a vector we will usually mean a column vector, that is an $n \times 1$ matrix. If we mean a row vector we will explicitly say so (a row vector is a $1 \times n$ matrix). We will usually denote matrices by upper case letters and vectors by lower case letters with an arrow such as $\vec{x}$ or $\vec{b}$. By $\vec{0}$ we will mean the vector of all zeros.
It is easy to define some operations on matrices. Note that we will want $1 \times 1$ matrices to really act like numbers, so our operations will have to be compatible with this viewpoint.
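First, we can multiply by a scalar (a number). This means just multiplying each entry by the same number. For example,
$$2 \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{bmatrix} .$$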
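Matrix addition is also easy. We add matrices element by element. For example,
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} + \begin{bmatrix} 1 & 1 & -1 \\ 0 & 2 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 3 & 2 \\ 4 & 7 & 10 \end{bmatrix} .$$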
If the sizes do not match, then addition is not defined.
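The two operations above can be sketched in a few lines of Python. This is only an illustration under the assumption that a matrix is stored as a list of rows; the function names `scalar_multiply` and `add` are made up for this sketch and do not come from any library.

```python
def scalar_multiply(c, A):
    """Multiply every entry of the matrix A (a list of rows) by the scalar c."""
    return [[c * entry for entry in row] for row in A]


def add(A, B):
    """Add two matrices entry by entry; the sum is undefined if sizes differ."""
    same_rows = len(A) == len(B)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrix sizes do not match, so the sum is not defined")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 1, -1], [0, 2, 4]]
print(scalar_multiply(2, A))  # [[2, 4, 6], [8, 10, 12]]
print(add(A, B))              # [[2, 3, 2], [4, 7, 10]]
```

Calling `add` on matrices of different sizes raises an error, mirroring the fact that such a sum is not defined.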
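If we denote by 0 the matrix with all zero entries, by $c$, $d$ scalars, and by $A$, $B$, $C$ matrices, we have the following familiar rules.
$$\begin{aligned} A + 0 &= A = 0 + A, \\ A + B &= B + A, \\ (A + B) + C &= A + (B + C), \\ c(A + B) &= cA + cB, \\ (c + d)A &= cA + dA. \end{aligned}$$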
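Another useful operation for matrices is the so-called transpose. This operation just swaps rows and columns of a matrix. The transpose of $A$ is denoted by $A^T$. Example:
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} .$$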
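As a small illustration, swapping rows and columns is a one-liner in Python when a matrix is stored as a list of rows (an assumption of this sketch; the helper name `transpose` is hypothetical).

```python
def transpose(A):
    """Return the transpose of A: row i of the result is column i of A."""
    return [list(column) for column in zip(*A)]


A = [[1, 2, 3], [4, 5, 6]]
print(transpose(A))             # [[1, 4], [2, 5], [3, 6]]
print(transpose(transpose(A)))  # [[1, 2, 3], [4, 5, 6]]
```

Transposing twice returns the original matrix, as the second line of output shows.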
Let us now define matrix multiplication. First we define the so-called dot product (or inner product) of two vectors. Usually this will be a row vector multiplied with a column vector of the same size. For the dot product we multiply each pair of entries from the first and the second vector and we sum these products. The result is a single number. For example,
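$$\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \cdot \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = a_1 b_1 + a_2 b_2 + a_3 b_3 .$$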
And similarly for larger (or smaller) vectors.
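Armed with the dot product we can define the product of matrices. First let us denote by $\operatorname{row}_i(A)$ the $i^{\text{th}}$ row of $A$ and by $\operatorname{column}_j(A)$ the $j^{\text{th}}$ column of $A$. For an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$ we can define the product $AB$. We let $AB$ be an $m \times p$ matrix whose $ij^{\text{th}}$ entry is
$$\operatorname{row}_i(A) \cdot \operatorname{column}_j(B) .$$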
Do note how the sizes match up. Example:
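$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1\cdot 1 + 2\cdot 1 + 3\cdot 1 & 1\cdot 0 + 2\cdot 1 + 3\cdot 0 & 1\cdot(-1) + 2\cdot 1 + 3\cdot 0 \\ 4\cdot 1 + 5\cdot 1 + 6\cdot 1 & 4\cdot 0 + 5\cdot 1 + 6\cdot 0 & 4\cdot(-1) + 5\cdot 1 + 6\cdot 0 \end{bmatrix} = \begin{bmatrix} 6 & 2 & 1 \\ 15 & 5 & 1 \end{bmatrix} .$$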
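The definition of the product, each entry being the dot product of a row of the first matrix with a column of the second, translates almost word for word into code. The following is a minimal sketch, assuming matrices are lists of rows; `dot`, `column`, and `matmul` are names invented for this illustration.

```python
def dot(u, v):
    """Dot product of two vectors of the same size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same size")
    return sum(a * b for a, b in zip(u, v))


def column(B, j):
    """The j-th column of B (counting from 0), as a list."""
    return [row[j] for row in B]


def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B, giving m x p."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    return [[dot(row, column(B, j)) for j in range(p)] for row in A]


A = [[1, 2, 3], [4, 5, 6]]               # 2 x 3
B = [[1, 0, -1], [1, 1, 1], [1, 0, 0]]   # 3 x 3
print(matmul(A, B))  # [[6, 2, 1], [15, 5, 1]]
```

Note that `matmul(B, A)` raises an error here: $B$ has 3 columns but $A$ has only 2 rows, so the sizes do not match up.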
For multiplication we want an analogue of a 1. This analogue is the so-called identity matrix. The identity matrix is a square matrix with 1s on the main diagonal and zeros everywhere else. It is usually denoted by $I$. For each size we have a different identity matrix and so sometimes we may denote the size as a subscript. For example, $I_3$ would be the $3 \times 3$ identity matrix
$$I = I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} .$$
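We have the following rules for matrix multiplication. Suppose that $A$, $B$, $C$ are matrices of the correct sizes so that the following make sense. Let $\alpha$ denote a scalar (number).
$$\begin{aligned} A(BC) &= (AB)C, \\ A(B + C) &= AB + AC, \\ (B + C)A &= BA + CA, \\ \alpha(AB) &= (\alpha A)B = A(\alpha B), \\ IA &= A = AI. \end{aligned}$$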
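A few warnings are in order.
(i) $AB \not= BA$ in general (it may be true by fluke sometimes). That is, matrices do not commute.
(ii) $AB = AC$ does not necessarily imply $B = C$, even if $A$ is not 0.
(iii) $AB = 0$ does not necessarily mean that $A = 0$ or $B = 0$.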
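A concrete instance makes the pitfalls of matrix multiplication tangible: products need not commute, a common left factor cannot in general be cancelled, and a product can be zero without either factor being zero. The matrices below are sample data chosen for this sketch, and `matmul` is a small hypothetical helper rather than a library function.

```python
def matmul(A, B):
    """Product of matrices stored as lists of rows (sizes assumed compatible)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[0, 1], [0, 0]]
B = [[1, 0], [0, 0]]
C = [[2, 0], [0, 0]]

print(matmul(A, B))  # [[0, 0], [0, 0]]
print(matmul(B, A))  # [[0, 1], [0, 0]]
print(matmul(A, C))  # [[0, 0], [0, 0]]
```

Here $AB \not= BA$, the product $AB$ is the zero matrix although neither $A$ nor $B$ is zero, and $AB = AC$ even though $B \not= C$.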
For the last two items to hold we would need to “divide” by a matrix. This is where the matrix inverse comes in. Suppose that $A$ and $B$ are $n \times n$ matrices such that
$$AB = I = BA .$$
Then we call $B$ the inverse of $A$ and we denote $B$ by $A^{-1}$. If the inverse of $A$ exists, then we call $A$ invertible. If $A$ is not invertible we sometimes say $A$ is singular.
If $A$ is invertible, then $AB = AC$ does imply that $B = C$ (in particular the inverse of $A$ is unique). We just multiply both sides by $A^{-1}$ to get $A^{-1}AB = A^{-1}AC$ or $IB = IC$ or $B = C$. It is also not hard to see that $(A^{-1})^{-1} = A$.
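We can now talk about determinants of square matrices. We define the determinant of a $1 \times 1$ matrix as the value of its only entry. For a $2 \times 2$ matrix we define
$$\det\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) = ad - bc .$$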
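Before trying to compute the determinant for larger matrices, let us first note the meaning of the determinant. Consider an $n \times n$ matrix as a mapping of the $n$-dimensional euclidean space $\mathbb{R}^n$ to $\mathbb{R}^n$. In particular, a $2 \times 2$ matrix $A$ is a mapping of the plane to itself, where $\vec{x}$ gets sent to $A\vec{x}$. Then the determinant of $A$ is the factor by which the area of objects gets changed. If we take the unit square (square of side 1) in the plane, then $A$ takes the square to a parallelogram of area $|\det(A)|$. The sign of $\det(A)$ denotes changing of orientation (negative if the axes got flipped). For example, let
$$A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} .$$
Then $\det(A) = 1 + 1 = 2$. Let us see where the square with vertices $(0,0)$, $(1,0)$, $(0,1)$, and $(1,1)$ gets sent. Clearly $(0,0)$ gets sent to $(0,0)$.
$$\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} .$$
So the image of the square is another square. The image square has a side of length $\sqrt{2}$ and is therefore of area 2.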
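If you think back to high school geometry, you may have seen a formula for computing the area of a parallelogram with vertices $(0,0)$, $(a,c)$, $(b,d)$ and $(a+b,c+d)$. And it is precisely
$$\left\lvert \, \det\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \right\rvert .$$
The vertical lines above mean absolute value. The matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ carries the unit square to the given parallelogram.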
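The area interpretation can be checked numerically. The sketch below maps the corners of the unit square by a $2 \times 2$ matrix and measures the signed area of the image with the shoelace formula, a standard polygon-area formula that is brought in here as an outside tool and is not part of the text. The sample matrix with rows $(1, 1)$ and $(-1, 1)$ is an assumption chosen to match the description above (determinant 2, image a square of side $\sqrt{2}$); all function names are hypothetical.

```python
def apply(A, point):
    """Send the point (x, y) to A times the column vector (x, y)."""
    (a, b), (c, d) = A
    x, y = point
    return (a * x + b * y, c * x + d * y)


def signed_area(polygon):
    """Shoelace formula: positive when vertices run counterclockwise."""
    total = 0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2


def det2(A):
    """Determinant of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    return a * d - b * c


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # counterclockwise order
A = [[1, 1], [-1, 1]]
image = [apply(A, p) for p in unit_square]
print(image)               # [(0, 0), (1, -1), (2, 0), (1, 1)]
print(signed_area(image))  # 2.0
print(det2(A))             # 2
```

For a matrix that flips orientation, such as the one with rows $(0, 1)$ and $(1, 0)$, the signed area of the image comes out as $-1$, again equal to the determinant.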
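Now we can define the determinant for larger matrices. We define $A_{ij}$ as the matrix $A$ with the $i^{\text{th}}$ row and the $j^{\text{th}}$ column deleted. To compute the determinant of a matrix, pick one row, say the $i^{\text{th}}$ row, and compute
$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(A_{ij}) .$$
For the first row we get
$$\det(A) = a_{11} \det(A_{11}) - a_{12} \det(A_{12}) + a_{13} \det(A_{13}) - \cdots + (-1)^{1+n} a_{1n} \det(A_{1n}) .$$
We alternately add and subtract the determinants of the submatrices $A_{ij}$, each multiplied by $a_{ij}$, for a fixed $i$ and all $j$. For a $3 \times 3$ matrix, picking the first row, we would get $\det(A) = a_{11} \det(A_{11}) - a_{12} \det(A_{12}) + a_{13} \det(A_{13})$. For example,
$$\begin{aligned} \det\left( \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \right) &= 1 \cdot \det\left( \begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix} \right) - 2 \cdot \det\left( \begin{bmatrix} 4 & 6 \\ 7 & 9 \end{bmatrix} \right) + 3 \cdot \det\left( \begin{bmatrix} 4 & 5 \\ 7 & 8 \end{bmatrix} \right) \\ &= 1(5 \cdot 9 - 6 \cdot 8) - 2(4 \cdot 9 - 6 \cdot 7) + 3(4 \cdot 8 - 5 \cdot 7) = 0 . \end{aligned}$$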
The numbers $(-1)^{i+j} \det(A_{ij})$ are called cofactors of the matrix and this way of computing the determinant is called the cofactor expansion. It is also possible to compute the determinant by expanding along columns (picking a column instead of a row above).
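Cofactor expansion along the first row is naturally recursive: the determinant of an $n \times n$ matrix is expressed through determinants of $(n-1) \times (n-1)$ submatrices. The following is a minimal sketch, assuming a matrix is a list of rows and with hypothetical helper names `submatrix` and `det`. It is meant to mirror the definition, not to be efficient; the number of operations grows roughly like $n!$.

```python
def submatrix(A, i, j):
    """A with row i and column j deleted (indices count from 0)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("determinant is only defined for square matrices")
    if n == 1:
        return A[0][0]
    # Signs alternate +, -, +, ... along the first row.
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(n))


print(det([[1, 2], [3, 4]]))                   # -2
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 0
```

Expanding along a row or column with many zeros saves work by hand, since each zero entry removes one subdeterminant from the sum.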
Note that a common notation for the determinant is a pair of vertical lines:
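$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \det\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) .$$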
I personally find this notation confusing as vertical lines usually mean a positive quantity, while determinants can be negative. I will not use this notation in this book.
One of the most important properties of determinants (in the context of this course) is the following theorem.
Theorem: An $n \times n$ matrix $A$ is invertible if and only if $\det(A) \not= 0$.
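In fact, there is a formula for the inverse of a $2 \times 2$ matrix
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} .$$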
Notice the determinant of the matrix in the denominator of the fraction. The formula only works if the determinant is nonzero, otherwise we are dividing by zero.
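The $2 \times 2$ inverse formula (swap the diagonal entries, negate the off-diagonal ones, divide by the determinant) can be written out directly. The sketch below uses exact fractions from Python's standard library so that no rounding occurs; the function name `inverse_2x2` is hypothetical.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] via the determinant formula."""
    (a, b), (c, d) = M
    determinant = Fraction(a * d - b * c)
    if determinant == 0:
        # Dividing by zero is exactly where the formula breaks down.
        raise ValueError("determinant is zero, so the matrix is not invertible")
    return [[d / determinant, -b / determinant],
            [-c / determinant, a / determinant]]


inv = inverse_2x2([[1, 2], [3, 4]])
print([[str(entry) for entry in row] for row in inv])
# [['-2', '1'], ['3/2', '-1/2']]
```

Trying `inverse_2x2([[1, 2], [2, 4]])` raises an error, since that matrix has determinant $1 \cdot 4 - 2 \cdot 2 = 0$.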
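One application of matrices we will need is to solve systems of linear equations. This is best shown by example. Suppose that we have the following system of linear equations
$$\begin{aligned} 2 x_1 + 2 x_2 + 2 x_3 &= 2, \\ x_1 + x_2 + 3 x_3 &= 5, \\ x_1 + 4 x_2 + x_3 &= 10. \end{aligned}$$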
Without changing the solution, we could swap equations in this system, we could multiply any of the equations by a nonzero number, and we could add a multiple of one equation to another equation. It turns out these operations always suffice to find a solution.
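It is easier to write the system as a matrix equation. Note that the system can be written as
$$\begin{bmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix} .$$
To solve the system we put the coefficient matrix (the matrix on the left hand side of the equation) together with the vector on the right hand side and get the so-called augmented matrix
$$\left[ \begin{array}{ccc|c} 2 & 2 & 2 & 2 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{array} \right] .$$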
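We apply the following three elementary operations.
(i) Swap two rows.
(ii) Multiply a row by a nonzero number.
(iii) Add a multiple of one row to another row.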
We will keep doing these operations until we get into a state where it is easy to read off the answer, or until we get into a contradiction indicating no solution, for example if we come up with an equation such as $0 = 1$.
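Let us work through the example. First multiply the first row by $\frac{1}{2}$ to obtain
$$\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{array} \right] .$$
Now subtract the first row from the second and third row.
$$\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 2 & 4 \\ 0 & 3 & 0 & 9 \end{array} \right]$$
Multiply the last row by $\frac{1}{3}$ and the second row by $\frac{1}{2}$.
$$\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 1 & 0 & 3 \end{array} \right]$$
Swap rows 2 and 3.
$$\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \end{array} \right]$$
Subtract the last row from the first, then subtract the second row from the first.
$$\left[ \begin{array}{ccc|c} 1 & 0 & 0 & -4 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \end{array} \right]$$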
If we think about what equations this augmented matrix represents, we see that $x_1 = -4$, $x_2 = 3$, and $x_3 = 2$. We try this solution in the original system and, voilà, it works!
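The row reduction procedure can be automated using only the three elementary operations: swapping rows, scaling a row, and adding a multiple of one row to another. Below is a minimal sketch that assumes a square system with a unique solution and uses exact fractions to avoid rounding. The sample system in the code (coefficient rows $2, 2, 2$ / $1, 1, 3$ / $1, 4, 1$ with right hand side $2, 5, 10$) is an assumption chosen to be consistent with the row operations described above, and the function name `solve` is hypothetical. The code may apply the operations in a different order than the hand computation, but it arrives at the same reduced matrix.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by row reducing the augmented matrix [A | b].

    Assumes A is square and the solution is unique.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the system has no unique solution")
        M[col], M[pivot] = M[pivot], M[col]          # swap two rows
        lead = M[col][col]
        M[col] = [v / lead for v in M[col]]          # scale the row
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                # Subtract a multiple of the pivot row from row r.
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]


x = solve([[2, 2, 2], [1, 1, 3], [1, 4, 1]], [2, 5, 10])
print([str(v) for v in x])  # ['-4', '3', '2']
```

Substituting back, $2(-4) + 2(3) + 2(2) = 2$, $-4 + 3 + 3(2) = 5$, and $-4 + 4(3) + 2 = 10$, so the computed values satisfy all three equations.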
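We can write this equation in matrix notation as
$$A \vec{x} = \vec{b} ,$$
where $A$ is the matrix $\begin{bmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{bmatrix}$ and $\vec{b}$ is the vector $\begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix}$. The solution can also be computed via the inverse,
$$\vec{x} = A^{-1} A \vec{x} = A^{-1} \vec{b} .$$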
One last note to make about linear systems of equations is that it is possible that the solution is not unique (or that no solution exists). It is easy to tell if a solution does not exist. If during the row reduction you come up with a row where all the entries except the last one are zero (the last entry in a row corresponds to the right hand side of the equation) the system is inconsistent and has no solution. For example if for a system of 3 equations and 3 unknowns you find a row such as
$[\, 0 \quad 0 \quad 0 \mid 1 \,]$ in the augmented matrix, you know the system is inconsistent.
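The first nonzero entry in each row is called the leading entry. You generally try to use row operations until the following conditions are satisfied.
(i) The leading entry in any row is strictly to the right of the leading entry of the row above.
(ii) Any zero rows are below all the nonzero rows.
(iii) All leading entries are 1.
(iv) All the entries above and below a leading entry are zero.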
Such a matrix is said to be in reduced row echelon form. The variables corresponding to columns with no leading entries are said to be free variables. Free variables mean that we can pick those variables to be anything we want and then solve for the rest of the unknowns.
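Example 3.2.1: The following augmented matrix is in reduced row echelon form.
$$\left[ \begin{array}{ccc|c} 1 & 2 & 0 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{array} \right]$$
Suppose the variables are $x_1$, $x_2$, and $x_3$. Then $x_2$ is the free variable, $x_1 = 3 - 2 x_2$, and $x_3 = 1$.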
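On the other hand if during the row reduction process you come up with the matrix
$$\left[ \begin{array}{ccc|c} 1 & 2 & 13 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 3 \end{array} \right] ,$$
there is no need to go further. The last row corresponds to the equation $0 x_1 + 0 x_2 + 0 x_3 = 3$, which is preposterous. Hence, no solution exists.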
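Reading off free variables and spotting inconsistency can both be done from the positions of the leading entries after row reduction. The sketch below is one way to do it under stated assumptions: the augmented matrix is a list of rows whose last entry is the right hand side, and the names `rref` and `classify` are hypothetical. A leading entry landing in the last column means some row reads "zero equals nonzero", so the system is inconsistent; otherwise every column without a leading entry corresponds to a free variable. The two sample matrices are illustrative data, one in reduced row echelon form with a single free variable and one with a contradictory last row.

```python
from fractions import Fraction


def rref(M):
    """Return (R, pivots): the reduced row echelon form of M and the
    column indices (from 0) that contain leading entries."""
    R = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pivot is None:
            continue                      # no leading entry in this column
        R[r], R[pivot] = R[pivot], R[r]
        lead = R[r][c]
        R[r] = [v / lead for v in R[r]]   # make the leading entry 1
        for i in range(rows):
            if i != r and R[i][c] != 0:
                f = R[i][c]
                R[i] = [x - f * y for x, y in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots


def classify(augmented):
    """Return (kind, free) where free lists the free variable indices."""
    _, pivots = rref(augmented)
    unknowns = len(augmented[0]) - 1
    if unknowns in pivots:                # leading entry in the last column
        return "inconsistent", []
    free = [j for j in range(unknowns) if j not in pivots]
    return ("unique" if not free else "infinitely many"), free


print(classify([[1, 2, 0, 3], [0, 0, 1, 1], [0, 0, 0, 0]]))
# ('infinitely many', [1])
print(classify([[1, 2, 13, 3], [0, 0, 1, 1], [0, 0, 0, 3]]))
# ('inconsistent', [])
```

Index 1 in the first result refers to the second variable, because the code counts columns from 0.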
If the coefficient matrix is square and there exists a unique solution $\vec{x}$ to $A \vec{x} = \vec{b}$ for any $\vec{b}$, then $A$ is invertible. In fact by multiplying both sides by $A^{-1}$ you can see that $\vec{x} = A^{-1} \vec{b}$. So it is useful to compute the inverse if you want to solve the equation for many different right hand sides $\vec{b}$.
The $2 \times 2$ inverse can be given by a formula, but it is also not hard to compute inverses of larger matrices. While we will not have too much occasion to compute inverses for matrices larger than $2 \times 2$ by hand, let us touch on how to do it. Finding the inverse of $A$ is actually just solving a bunch of linear equations. If we can solve $A \vec{x}_k = \vec{e}_k$ where $\vec{e}_k$ is the vector with all zeros except a 1 at the $k^{\text{th}}$ position, then the inverse is the matrix with the columns $\vec{x}_k$ for $k = 1, \ldots, n$ (exercise: why?). Therefore, to find the inverse we can write a larger $n \times 2n$ augmented matrix $[\, A \mid I \,]$, where $I$ is the identity. We then perform row reduction. The reduced row echelon form of $[\, A \mid I \,]$ will be of the form $[\, I \mid A^{-1} \,]$ if and only if $A$ is invertible. We can then just read off the inverse $A^{-1}$.
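The procedure just described, row reducing the augmented matrix made of $A$ next to the identity and reading the inverse off the right half, can be sketched as follows. Exact fractions are used to avoid rounding, the function name `inverse` is hypothetical, and the sample matrix is illustrative data only.

```python
from fractions import Fraction


def inverse(A):
    """Invert a square matrix by row reducing [A | I] to [I | A^(-1)]."""
    n = len(A)
    # Build the n x 2n augmented matrix [A | I].
    M = [[Fraction(v) for v in row] +
         [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            # The left half cannot be reduced to the identity.
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [v / lead for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]          # right half is the inverse


inv = inverse([[1, 2], [3, 4]])
print([[str(v) for v in row] for row in inv])
# [['-2', '1'], ['3/2', '-1/2']]
```

The result agrees with the $2 \times 2$ determinant formula, and the same function works unchanged for larger square matrices.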
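Exercise 3.2.4: Compute the determinant of $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 4 & 0 & 5 & 0 \\ 6 & 0 & 7 & 0 \\ 8 & 0 & 10 & 1 \end{bmatrix}$. Hint: Expand along the proper row or column to make the calculations simpler.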