Tuesday, October 11, 2011

Lecture 2

Following this lecture you should be able to do Example Sheet 1, #1−9 (except 8 (b), although you can probably guess what is meant).

Someone has rightly pointed out that at the bottom of page 2 of the notes, the line should end p_{ i_{n−1} i_n }. This has now been corrected in the notes.

Following the lecture someone asked me a question about the eigenvectors of matrices, specifically, "Is the set of eigenvalues obtained from left-hand eigenvectors the same as that obtained from right-hand eigenvectors?" (The answer is yes, since P and its transpose have the same characteristic polynomial.) This question tells me that I might have referred to facts of linear algebra that are fairly new to you, or only briefly covered in the IA Vectors and Matrices course. In the IB Linear Algebra course you will learn more. For example, the schedules include "Algebraic and geometric multiplicity of eigenvalues. Statement and illustration of Jordan normal form." It will be interesting for you to think again about the calculation of P^n once you know that it is always possible to write P=UJU^{−1}, where J is an almost diagonal matrix and U is a matrix whose columns are right-hand eigenvectors (generalized ones, when J is not diagonal); the rows of U^{−1} play the corresponding left-hand role. Having said this, we shall not actually need any advanced results from linear algebra in our course. Most of our proofs are probabilistic rather than matrix-algebraic. Today's discussion of P^n is perhaps the one exception, since if you wish to fully understand the solution of the recurrence relations when the characteristic polynomial has repeated roots then the representation P=UJU^{−1} is helpful.

This lecture was mostly about how to calculate the elements of P^n by solving recurrence relations. We ended the lecture with definitions of "i communicates with j", the idea of a class, and closed and open classes. If the Markov chain consists of only one class (and so every state can be reached from every other) then the Markov chain is said to be irreducible.
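If you like to experiment, the definitions of communicating, closed and open classes can be explored with a short program. What follows is only a sketch: the function name communicating_classes and the 4-state matrix are made up for illustration, and reachability is computed with Warshall's transitive-closure algorithm, which is my choice of method rather than anything from the lecture.

```python
def communicating_classes(P):
    """Return (classes, closed_flags) for a transition matrix P given as a list of rows."""
    m = len(P)
    # reach[i][j] is True when j can be reached from i in zero or more steps.
    reach = [[i == j or P[i][j] > 0 for j in range(m)] for i in range(m)]
    # Warshall's algorithm: allow paths that pass through state k.
    for k in range(m):
        for i in range(m):
            for j in range(m):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    classes = []
    seen = set()
    for i in range(m):
        if i in seen:
            continue
        # i and j communicate when each can be reached from the other.
        cls = [j for j in range(m) if reach[i][j] and reach[j][i]]
        seen.update(cls)
        classes.append(cls)
    # A class is closed when nothing outside it can be reached from inside it.
    closed = [
        all(j in cls for i in cls for j in range(m) if reach[i][j])
        for cls in classes
    ]
    return classes, closed


# Hypothetical 4-state chain: states 0 and 1 communicate, state 2 can leak
# into {0, 1} or into the absorbing state 3.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.25, 0.0, 0.5, 0.25],
    [0.0, 0.0, 0.0, 1.0],
]
classes, closed = communicating_classes(P)
print(classes)  # [[0, 1], [2], [3]]
print(closed)   # [True, False, True]
```

In this language the chain is irreducible exactly when the returned list contains a single class.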

Notice that if P is m x m and irreducible then Q=(1/m)(I+P+P^2+···+P^{m-1}) is a transition matrix all of whose elements are positive (can you see why? A hint is the pigeonhole principle).
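A numerical check is no substitute for the pigeonhole argument, but it may help you believe the claim. The sketch below uses a hypothetical deterministic 3-cycle, for which no single power of P is strictly positive, and confirms that Q nevertheless is. The helper names mat_mul and averaged_matrix are my own.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def averaged_matrix(P):
    """Return Q = (1/m)(I + P + P^2 + ... + P^(m-1))."""
    m = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]  # P^0 = I
    total = [[0.0] * m for _ in range(m)]
    for _ in range(m):
        total = [[total[i][j] + power[i][j] for j in range(m)] for i in range(m)]
        power = mat_mul(power, P)
    return [[x / m for x in row] for row in total]


# Hypothetical example: the deterministic cycle 0 -> 1 -> 2 -> 0.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
Q = averaged_matrix(P)

all_positive = all(x > 0 for row in Q for x in row)
rows_sum_to_one = all(abs(sum(row) - 1.0) < 1e-12 for row in Q)
print(all_positive, rows_sum_to_one)
```

Here every power of P is a permutation matrix, so each has zeros, yet between them I, P and P^2 cover every entry.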

Here now is a sidebar on some interesting results in matrix algebra that are related to today's topics. We said in this lecture that if P is m x m and has m distinct eigenvalues, 1, mu_2, ..., mu_m, then

p_{ij}^{(n)} = a_1 + a_2 mu_2^n + ··· + a_m mu_m^n

for some constants a_1, a_2, ..., a_m.
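To see the method at work in the smallest case, here is a sketch for a two-state chain with hypothetical parameters alpha and beta (my notation, not necessarily that of the notes). The eigenvalues are 1 and mu_2 = 1 − alpha − beta, the constants a_1 and a_2 are fitted from the known values of p_{00}^{(n)} at n=0 and n=1, and the resulting formula is compared with direct matrix multiplication.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


# Hypothetical two-state chain; alpha and beta are illustrative values.
alpha, beta = 0.3, 0.1
P = [[1 - alpha, alpha],
     [beta, 1 - beta]]

# The eigenvalues of P are 1 and mu2 (their sum is the trace of P).
mu2 = 1 - alpha - beta

# Fit p_00^(n) = a1 + a2 * mu2**n using p_00^(0) = 1 and p_00^(1) = 1 - alpha.
p_at_0, p_at_1 = 1.0, P[0][0]
a2 = (p_at_0 - p_at_1) / (1 - mu2)  # requires mu2 != 1, i.e. alpha + beta > 0
a1 = p_at_0 - a2


def p00_closed_form(n):
    return a1 + a2 * mu2 ** n


# Compare with the (0, 0) entry of P^n computed by repeated multiplication.
power = [[1.0, 0.0], [0.0, 1.0]]
max_error = 0.0
for n in range(21):
    max_error = max(max_error, abs(power[0][0] - p00_closed_form(n)))
    power = mat_mul(power, P)

print(a1, a2, max_error)
```

Since |mu_2|<1 here, the second term dies away and p_{00}^{(n)} tends to a_1 = beta/(alpha+beta).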

We would like to know more about the eigenvalues mu_2, ..., mu_m. In particular, let |mu_j| denote the modulus of mu_j. If |mu_j|<1 for all j>1 then p_{ij}^{(n)} tends to a_1 as n tends to infinity (as we see happening in Example 2.1).
We claim the following.
  1. |mu_j| ≤ 1 for all j>1.
  2. Suppose there exists n such that P^n is strictly positive. Then |mu_j|<1 for all j>1, and so p_{ij}^{(n)} tends to a_1 as n tends to infinity.
These facts are consequences of Perron-Frobenius theory. This theory (dating from about 100 years ago and useful in many branches of mathematics) says the following.

Suppose A is a square matrix that is non-negative and irreducible (in the sense that for all i,j, we have (A^n)_{ij}>0 for some n). Then there exists a positive real number, say lambda, such that (i) lambda is an eigenvalue of A, (ii) lambda has multiplicity 1, (iii) both the left and right-hand eigenvectors corresponding to lambda are strictly positive, (iv) no other eigenvector of A is strictly positive, (v) all eigenvalues of A are in modulus no greater than lambda, (vi) if, moreover, A is strictly positive then all other eigenvalues of A are in modulus strictly less than lambda, and
(vii) min_i sum_j a_{ij} ≤ lambda ≤ max_i sum_j a_{ij},
i.e. lambda lies between the minimum and maximum of the row sums of A.

So in the case that A=P (i.e. the transition matrix of an irreducible Markov chain), every row sum equals 1, and so (vii) implies that lambda=1 (and (1,1,...,1) is the corresponding right-hand eigenvector).
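Statement (vii) is more interesting to check when the row sums differ. The sketch below estimates lambda for a small strictly positive matrix by power iteration. This is a standard numerical technique that I am bringing in for illustration, not part of the theory above, and the matrix A and the function name perron_root are made up.

```python
def perron_root(A, iterations=200):
    """Estimate the largest eigenvalue of a strictly positive matrix A, with a
    right-hand eigenvector scaled so that its largest entry is 1."""
    m = len(A)
    v = [1.0] * m
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(m)) for i in range(m)]
        lam = max(w)              # v has largest entry 1, so max(Av) estimates lambda
        v = [x / lam for x in w]  # rescale so the largest entry is 1 again
    return lam, v


# Hypothetical strictly positive matrix with row sums 3 and 4.
A = [[2.0, 1.0],
     [1.0, 3.0]]
lam, v = perron_root(A)

row_sums = [sum(row) for row in A]
print(min(row_sums), lam, max(row_sums))
print(v)
```

For this A the characteristic polynomial is x^2 − 5x + 5, so lambda = (5+√5)/2, which indeed lies between 3 and 4, and the computed eigenvector is strictly positive as (iii) promises.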

Of the claims made earlier, 1 follows from (v) and (vii). To see 2, note from (vi) that if P^n is strictly positive then all its eigenvalues other than 1 are in modulus strictly less than 1. But if mu is an eigenvalue of P then mu^n is an eigenvalue of P^n. Moreover, by (ii) applied to P^n, the eigenvalue 1 of P^n has multiplicity 1, so mu_j^n cannot equal 1 for j>1. Hence we must have |mu_j|^n<1, and so |mu_j|<1, for all j>1.

As mentioned above, if P is irreducible then Q=(1/m)(I+P+P^2+···+P^{m−1}) must be positive (i.e. a matrix of positive elements). Thus from Perron-Frobenius theory, Q has a largest eigenvalue 1 and all its other eigenvalues are in modulus strictly less than 1. From these observations it follows that

Q^n tends to a limit as n tends to infinity.
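The contrast with P^n can be seen in a small hypothetical example: a 3-state chain that is irreducible but alternates between {0, 2} and {1}, so that P^n oscillates while Q^n settles down. This is a sketch only. The helper names are mine, and the limit (1/4, 1/2, 1/4) was worked out by hand for this particular chain.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mat_pow(A, n):
    """A^n by repeated multiplication (adequate for small n)."""
    m = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for _ in range(n):
        result = mat_mul(result, A)
    return result


# Hypothetical chain: from 0 or 2 move to 1; from 1 move to 0 or 2 with probability 1/2 each.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
m = len(P)

# Q = (1/m)(I + P + ... + P^(m-1)).
Q = [[sum(mat_pow(P, k)[i][j] for k in range(m)) / m for j in range(m)]
     for i in range(m)]

P_even, P_odd = mat_pow(P, 50), mat_pow(P, 51)
Q_n = mat_pow(Q, 50)

print(P_even[0], P_odd[0])  # the first row of P^n keeps alternating
print(Q_n[0])               # the first row of Q^n is close to (1/4, 1/2, 1/4)
```

Here P has eigenvalues 1, −1 and 0, so the eigenvalues of Q are 1, 1/3 and 1/3, and Q^n approaches its limit geometrically fast.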

Notice that a Markov chain with transition matrix Q can be obtained by inspecting our original chain at times 0, 0+Y_1, 0+Y_1+Y_2, 0+Y_1+Y_2+Y_3, ..., where the Y_i are i.i.d. random variables, each being uniformly distributed over the numbers 0,1,...,m−1.