Linear Algebra - in a Nutshell

We begin by setting up some background material from linear algebra. The theory is not presented in full generality, but simplified and adapted to our discussion. Denote the set of all real numbers by R. A vector of length n is simply a list of n elements, called vector entries. The notation for a (column) vector v having entries x₁, x₂, …, x_n is the following:

We will only look at vectors whose entries are real numbers (that is, x₁, x₂, …, x_n are all real numbers). As in the case of real numbers, real vectors can be added, or multiplied by a real number (called a scalar), in a straightforward way:
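Both operations act entry by entry. The following is a minimal Python sketch, assuming vectors are stored as plain lists; the function names add_vectors and scale_vector are illustrative and do not come from the lecture.

```python
def add_vectors(u, v):
    """Add two vectors of the same length entry by entry."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a + b for a, b in zip(u, v)]


def scale_vector(c, v):
    """Multiply every entry of the vector v by the scalar c."""
    return [c * a for a in v]


# Illustrative vectors of length 3 (not taken from the lecture).
u = [1, 2, 3]
v = [4, 5, 6]

print(add_vectors(u, v))    # [5, 7, 9]
print(scale_vector(2, u))   # [2, 4, 6]
```

Adding vectors of different lengths is not defined, which is why the sketch raises an error in that case.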

In a similar way we define the notion of a matrix. An m × n matrix is a rectangular array of entries having height m and width n. This can also be seen as placing n vectors of length m next to one another. Again, we take the entries to be real numbers. We denote an m × n matrix A with entries a_ij by:

This means that on row i and column j we find the real number a_ij. For example, on the first row, second column lies the element a₁₂. Notice that a vector of length m is just an m × 1 matrix and a real number is a 1 × 1 matrix. If m = n then the matrix is called a square matrix of size n. This is the case we will consider from now on. It makes sense now to talk about the diagonal of a matrix, which consists of those elements for which the row number equals the column number (the elements a₁₁, a₂₂, up to a_nn).

Notice that the last two matrices are rather "special": they are symmetric. This means that the entry on row i and column j is equal to the entry on row j and column i (in other words, a_ij = a_ji). As an easy exercise, prove that any diagonal matrix (one that has zeros everywhere except on the diagonal) is symmetric. The matrix that has ones on the diagonal and zeros everywhere else is called the identity matrix and is denoted by I. Matrix multiplication is done in the following general way:

The other two matrix operations, addition and scalar multiplication, are done as in the case of vectors. Adding matrices A and B gives a matrix C whose entry on row i and column j is equal to the sum of the corresponding entries of A and B. Multiplying a matrix A by a real number a is the same as multiplying every element of A by a.
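The three matrix operations can be sketched in Python as follows, assuming a matrix is stored as a list of rows. The product uses the standard row-by-column rule: the entry on row i and column j of A·B is the sum over k of a_ik · b_kj. The helper names and the two small matrices are illustrative assumptions, not the matrices of the lecture.

```python
def mat_add(A, B):
    """Entry-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]


def mat_mul(A, B):
    """Row-by-column product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    inner = len(B)
    if any(len(row) != inner for row in A):
        raise ValueError("the width of A must equal the height of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# Illustrative 2 x 2 matrices (not taken from the lecture).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

print(mat_add(A, B))     # [[1, 3], [4, 4]]
print(mat_scale(2, A))   # [[2, 4], [6, 8]]
print(mat_mul(A, B))     # [[2, 1], [4, 3]]
print(mat_mul(B, A))     # [[3, 4], [1, 2]]
```

The last two lines differ, which illustrates that the order of the factors matters in a matrix product.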

Problem #2: Consider the following three matrices:

What is the value of A+B, B·C, and A·B?
Show that (A+B)+C = A+(B+C) and that (A·B)·C=A·(B·C).
Find the product of A with the first column of C.

One can observe that there is some sort of similarity between matrices A and C in Problem 2 above. The similarity comes from the fact that the entry on row i and column j in matrix A is equal to the entry on row j and column i in matrix C. Of course, the elements on the diagonal are the same. We then say that A is the transpose of C, or equivalently that C is the transpose of A. In general, for a matrix A we denote its transpose by A^t. More intuitively, given a matrix we find its transpose by interchanging the element at row i, column j with the element at row j, column i. If we do this twice we notice that the transpose of the transpose of a matrix is the matrix itself, that is, (A^t)^t = A.
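As a small illustration, the transpose can be sketched in Python by turning columns into rows. The matrices below are illustrative assumptions, not those of Problem 2, and the names transpose and is_symmetric are hypothetical helpers.

```python
def transpose(A):
    """Return the transpose of A: row i of the result is column i of A."""
    return [list(column) for column in zip(*A)]


def is_symmetric(A):
    """A square matrix is symmetric exactly when it equals its transpose."""
    return A == transpose(A)


# Illustrative matrices (not taken from the lecture).
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
S = [[2, 7],
     [7, 5]]

print(transpose(A))                  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(transpose(transpose(A)) == A)  # True
print(is_symmetric(A))               # False
print(is_symmetric(S))               # True
```

The check is_symmetric restates the condition a_ij = a_ji as A = A^t.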

We now introduce two important notions in the theory of matrices: eigenvector and eigenvalue. Let A be a square matrix of size n. We say that the real number z is an eigenvalue for A if there exists a nonzero real vector v of length n such that A·v = z·v. Such a vector v is called an eigenvector corresponding to the eigenvalue z. (The zero vector is excluded because it satisfies A·v = z·v for every z.) This is not the most general definition, but it will suffice for our purposes. In general, eigenvalues and eigenvectors may be complex rather than real. If we assume that A is a (real) symmetric matrix of size n, then we know that it has n real eigenvalues (counted with multiplicity) and that the corresponding eigenvectors can be chosen to be real. In fact, any matrix of size n has at most n distinct eigenvalues.
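The defining relation A·v = z·v can be checked directly. The sketch below assumes integer entries so that exact comparison is safe (with floating-point entries one would compare up to a tolerance). The symmetric matrix and the helper names are illustrative assumptions, not taken from the lecture.

```python
def mat_vec(A, v):
    """Product of the matrix A (a list of rows) with the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def is_eigenpair(A, z, v):
    """Check whether v is a nonzero vector satisfying A·v = z·v."""
    if all(x == 0 for x in v):
        return False  # the zero vector never counts as an eigenvector
    return mat_vec(A, v) == [z * x for x in v]


# Illustrative symmetric matrix of size 2 (not taken from the lecture).
A = [[2, 1],
     [1, 2]]

print(is_eigenpair(A, 3, [1, 1]))    # True:  A·(1, 1)  = (3, 3)  = 3·(1, 1)
print(is_eigenpair(A, 1, [1, -1]))   # True:  A·(1, -1) = (1, -1) = 1·(1, -1)
print(is_eigenpair(A, 2, [1, 0]))    # False: A·(1, 0)  = (2, 1) is not 2·(1, 0)
```

This particular symmetric matrix of size 2 has the two real eigenvalues 3 and 1, consistent with the statement about symmetric matrices.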

To make these definitions clearer, consider the following explicit example:

Of course, in general a matrix A and its transpose A^t do not have the same eigenvectors that correspond to the common eigenvalues. For the matrix in the above example,

has eigenvalue z = 3 but the corresponding eigenvector is

. This follows from the computation below

An important observation is that a matrix A has more than one eigenvector corresponding to any given eigenvalue. Eigenvectors that correspond to the same eigenvalue may have no relation to one another. They can, however, be related, for example when one is a scalar multiple of another: if v is an eigenvector for z and c is a nonzero real number, then c·v is also an eigenvector for z, since A·(c·v) = c·(A·v) = z·(c·v). More precisely, in the last example, the vector whose entries are 0 and 1 is an eigenvector, and so is the vector whose entries are 0 and 2. It is a good exercise to check this by direct computation, as shown in Example 5.
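The scalar-multiple remark can be tried out numerically. The matrix M below is an assumption chosen only so that the vector with entries 0 and 1 is an eigenvector for z = 3; it is not necessarily the matrix of the lecture's example, and the helper name mat_vec is illustrative.

```python
def mat_vec(A, v):
    """Product of the matrix A (a list of rows) with the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Hypothetical matrix for which (0, 1) is an eigenvector with eigenvalue 3.
M = [[2, 0],
     [1, 3]]
z = 3

for c in (1, 2, -5):
    v = [0, c]                                # the scalar multiple c·(0, 1)
    left = mat_vec(M, v)                      # M·v
    right = [z * x for x in v]                # z·v
    print(v, left == right)                   # True for every nonzero c
```

Each nonzero choice of c gives another eigenvector for the same eigenvalue.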

A matrix is called column-stochastic if all of its entries are greater than or equal to zero (nonnegative) and the sum of the entries in each column is equal to 1. If all entries of a matrix are nonnegative, then we say that the matrix itself is nonnegative. Furthermore, a matrix is positive if all its entries are positive (greater than zero) real numbers.
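These three definitions translate directly into checks. The sketch below uses exact fractions from Python's standard library so that column sums can be compared with 1 without rounding error; the matrix and the function names are illustrative assumptions.

```python
from fractions import Fraction as F


def is_nonnegative(A):
    """Every entry is greater than or equal to zero."""
    return all(a >= 0 for row in A for a in row)


def is_positive(A):
    """Every entry is strictly greater than zero."""
    return all(a > 0 for row in A for a in row)


def is_column_stochastic(A):
    """Nonnegative entries, and every column sums to exactly 1."""
    column_sums = [sum(row[j] for row in A) for j in range(len(A[0]))]
    return is_nonnegative(A) and all(s == 1 for s in column_sums)


# Illustrative matrix: each column sums to 1.
A = [[F(1, 2), F(1, 4)],
     [F(1, 2), F(3, 4)]]
# Its transpose: each row sums to 1, but the columns sum to 3/4 and 5/4.
A_t = [[F(1, 2), F(1, 2)],
       [F(1, 4), F(3, 4)]]

print(is_column_stochastic(A))     # True
print(is_positive(A))              # True
print(is_column_stochastic(A_t))   # False
```

With floating-point entries, the comparison with 1 would need a small tolerance instead of exact equality.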

Example 6: Consider a matrix A with transpose A^t:

It is easy to see that A is column-stochastic, while A^t is not. However, the sum of the elements on each row of A^t is equal to 1. We first show that z = 1 is an eigenvalue for A^t, with corresponding eigenvector

This is true since A^t·v = 1·v. Then, from Fact 3, 1 is an eigenvalue for the matrix A as well.
Next we wish to find an eigenvector that corresponds to this eigenvalue for A. Let u be such an eigenvector. In order to find u explicitly we write the equation

x₁, x₂, x₃, and x₄ are all real numbers that we don't yet know. If we multiply term by term we find that

Substituting

in the third relation we obtain

and so the vector u is of the form

As x₁ is just a real number (hence a scalar) we can take x₁ = 12 and we have just proved that the vector whose entries are 12, 4, 9, and 6 (from top to bottom) is an eigenvector for A.

In the first part of the previous example we have just shown 1 is an eigenvalue for that particular case. However, this is true for any column-stochastic matrix, as stated below.
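The argument from the example carries over unchanged: every row of A^t sums to 1, so A^t sends the vector of all ones to itself, and A and A^t have the same eigenvalues. A minimal sketch of the first step, using a hypothetical 3 × 3 column-stochastic matrix (not the matrix of Example 6) and illustrative helper names:

```python
from fractions import Fraction as F


def transpose(A):
    """Return the transpose of A: row i of the result is column i of A."""
    return [list(column) for column in zip(*A)]


def mat_vec(A, v):
    """Product of the matrix A (a list of rows) with the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Hypothetical column-stochastic matrix: every column sums to 1.
A = [[F(0),    F(1, 2), F(1, 3)],
     [F(1, 2), F(0),    F(2, 3)],
     [F(1, 2), F(1, 2), F(0)]]

ones = [1, 1, 1]

# Row i of A^t is column i of A, so entry i of A^t·ones is the i-th column sum of A.
print(mat_vec(transpose(A), ones) == ones)   # True: A^t·ones = 1·ones
```

The computation never uses the particular entries, only the fact that each column of A sums to 1.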

Notice also that the eigenvector that we found in the second part of the example above is rather "special" itself. We have chosen x₁ = 12 and we obtained an eigenvector with positive entries. If however we choose x₁ = -12 then we obtain an eigenvector with negative entries (smaller than 0).

When we are working with positive column-stochastic matrices A it is possible to find an eigenvector v associated to the eigenvalue z = 1 such that all its entries are positive. Hence A·v = v and the entries of v are all positive.
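To see this on a small case, take a hypothetical positive column-stochastic matrix of size 2 (an assumption for illustration, not a matrix from the lecture). Solving A·v = v by hand gives x₂ = 2·x₁, so v = (1, 2) is one positive eigenvector; the sketch verifies this, rescales v so that its entries sum to 1, and confirms that flipping the sign gives an eigenvector with negative entries.

```python
from fractions import Fraction as F


def mat_vec(A, v):
    """Product of the matrix A (a list of rows) with the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Hypothetical positive column-stochastic matrix (columns sum to 1).
A = [[F(1, 2), F(1, 4)],
     [F(1, 2), F(3, 4)]]

v = [F(1), F(2)]                      # positive eigenvector for z = 1
print(mat_vec(A, v) == v)             # True: A·v = v

total = sum(v)
p = [x / total for x in v]            # rescaled so the entries sum to 1: (1/3, 2/3)
print(mat_vec(A, p) == p)             # True: a scalar multiple is still an eigenvector

negative = [-x for x in v]            # same eigenvalue, all entries negative
print(mat_vec(A, negative) == negative)   # True
```

Rescaling by a positive number keeps all entries positive, so the positive eigenvector is determined only up to such a factor.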

Lecture #1: Linear Algebra - in a Nutshell
