Tuesday, November 13, 2012

Notation

Some students say that the notation is one of the most difficult things about this course. I recommend that you make for yourself a one-page crib sheet of all the notation:

\begin{align*}
&(X_n)_{n\geq 0},\ I,\ \text{Markov}(\lambda,P),\ P = (p_{ij}),\ P(n) = (p_{ij}^{(n)}),\\[12pt]
&h_i^A,\ k_i^A,\ H_i,\ T_i,\ V_i,\ V_i^{(n)},\ f_i,\ \lambda,\ \pi,\ \gamma_i^k,\ m_i,\ m_{ij} .
\end{align*}
Write a little explanation for yourself as to what each notation means, and how it is used in our theorems about right-hand equations, recurrence/transience, left-hand equations, existence/uniqueness of invariant measure, aperiodicity/periodicity, positive/null recurrence and detailed balance. It should all seem pretty straightforward and memorable once you summarise it on one page and make some notes to place it in context.

Of course I could easily typeset a page like this for you — but I think that you'll learn more, and it will be more memorable for you personally, if you create this crib sheet yourself!

Lecture 12

The last lecture was on Tuesday November 13. I have enjoyed giving this course and I hope you have enjoyed it too. Thank you for your feedback.

The notes at the end of the course are slightly improved from those posted at the time of Lecture 1. I thank those of you who have spotted some typos.

You can now also look at the overhead projector slides that I sometimes used in lectures, such as those I used today to summarise four Part II courses that you may like to study next year: Probability and Measure, Applied Probability, Optimization and Control, and Stochastic Financial Models.

In my discussion of random walk and electrical networks in Section 12.4 I appealed to Rayleigh's Monotonicity Law: "if some resistances of a circuit are increased (decreased) the resistance between any two points of the circuit can only increase (decrease)." A proof of this "obvious" fact can be constructed by (i) proving Thomson's Principle: "Flows determined by Kirchhoff's Laws minimize energy dissipation", and then (ii) showing that Thomson's Principle implies Rayleigh's Monotonicity Law. You can read the details in Doyle and Snell, Random Walks and Electric Networks, pages 51–52.
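If you would like to see the Monotonicity Law in action, here is a small numerical sketch (Python, not part of the course; the network and the function name are just my illustration). It computes the effective resistance between two nodes by solving Kirchhoff's node equations, then increases one resistance and checks that the effective resistance can only go up.

```python
import numpy as np

def effective_resistance(conductances, n, a, b):
    """Effective resistance between nodes a and b of a resistor network.

    conductances: dict {(i, j): c_ij} of edge conductances (1/resistance).
    We inject 1 amp at a and extract it at b, solve Kirchhoff's node
    equations L v = current, and read off R_eff = v[a] - v[b].
    """
    L = np.zeros((n, n))                      # weighted graph Laplacian
    for (i, j), c in conductances.items():
        L[i, i] += c
        L[j, j] += c
        L[i, j] -= c
        L[j, i] -= c
    current = np.zeros(n)
    current[a], current[b] = 1.0, -1.0
    v = np.linalg.lstsq(L, current, rcond=None)[0]   # voltages, up to a constant
    return v[a] - v[b]

# A square 0-1-2-3 with one diagonal 0-2, all resistances 1 (conductance 1).
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0, (0, 2): 1.0}
print(effective_resistance(edges, 4, 0, 2))                  # 0.5

# Double the resistance of the diagonal (halve its conductance): the
# effective resistance between 0 and 2 increases.
print(effective_resistance({**edges, (0, 2): 0.5}, 4, 0, 2)) # 0.666...
```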

In Section 12.2 I mentioned Burke's output theorem (1956), which says that the output process of an $M/M/1$ queue in equilibrium is a Poisson process with the same rate as the input. In writing "$M/M/1$", the "$M$"s mean Markovian (i.e. a Poisson input process of rate $\lambda$ and i.i.d. exponentially distributed service times with parameter $\mu$, where $\mu > \lambda$), and the "$1$" means a single server. In queueing theory this very useful notation is known as Kendall's notation. For example, a $D/G/m$ queue is one in which arrivals are separated by a deterministic time, the service times have a general distribution, and there are $m$ servers working in parallel.

I remarked (just for fun) that "queueing" is the only common word in the OED with five vowels in a row. The obscure exceptions are words like "miaoued" (what the cat did).

I once proved a generalization of Burke's output theorem that holds even when the queue has not reached equilibrium (see: The interchangeability of $\cdot/M/1$ queues in series, Weber, 1979). Suppose we have two single-server first-in-first-out queues in series, which we might write as $\cdot/M/1 \to /M/1$. The customers' service times in the first queue are i.i.d. exponentially distributed with parameter $\lambda$ and in the second queue they are i.i.d. exponentially distributed with parameter $\mu$. On finishing service in the first queue a customer immediately joins the second queue. Suppose the system starts with $N$ customers in the first (upstream) queue and no customers in the second (downstream) queue. My theorem says that all statistics that we might measure about the departure process from the second queue are the same if $\lambda$ and $\mu$ are interchanged. Thus by observing the process of departures from the second queue we cannot figure out which way around the two $/M/1$ servers are ordered. For example, the time at which we see the first departure leave the second queue has expected value $1/\lambda + 1/\mu$ (which is symmetric in $\lambda$ and $\mu$). Its moment generating function is $\lambda\mu(\theta-\lambda)^{-1}(\theta-\mu)^{-1}$. The expected time at which the second departure takes place is
$$
\frac{2 \lambda^2+3 \lambda\mu+2 \mu^2}{\lambda^2 \mu+\lambda \mu^2}.
$$
All other statistics are also symmetric in $\lambda$ and $\mu$. It was by computing such quantities that I first formulated the conjecture that this might be true. Burke's theorem is a corollary of my result that can be obtained by thinking about $N$ tending to infinity (can you see how?).
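For anyone who wants to check this, here is a quick Monte Carlo sanity check (a Python sketch of my own, not taken from the paper; it tracks only the first two departures, which is all we need when $N\geq 2$). Swapping $\lambda$ and $\mu$ should leave the estimate unchanged, and both estimates should agree with the formula above.

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_second_departure(lam, mu, reps=1_000_000):
    """Two exponential single-server FIFO queues in series, all customers
    initially in the first queue.  Monte Carlo estimate of the expected
    time of the second departure from the downstream queue."""
    x1 = rng.exponential(1 / lam, reps)   # 1st service time at queue 1
    x2 = rng.exponential(1 / lam, reps)   # 2nd service time at queue 1
    s1 = rng.exponential(1 / mu, reps)    # 1st service time at queue 2
    s2 = rng.exponential(1 / mu, reps)    # 2nd service time at queue 2
    # Customer 1 leaves queue 2 at x1 + s1.  Customer 2 reaches queue 2 at
    # x1 + x2 and can start service only after customer 1 has left it.
    return (x1 + np.maximum(s1, x2) + s2).mean()

lam, mu = 1.0, 2.0
exact = (2 * lam**2 + 3 * lam * mu + 2 * mu**2) / (lam**2 * mu + lam * mu**2)
print(mean_second_departure(lam, mu), mean_second_departure(mu, lam), exact)
# All three numbers should agree, up to Monte Carlo error.
```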

Markov chains have been studied for about 100 years, but there are still interesting things to discover. An important on-going research area is Markov chain Monte Carlo.

Thursday, November 8, 2012

Fast retries lemma

There are many interesting consequences of reversibility. Here is one that I talked about at the end of today's lecture and which has a very practical application.

Theorem. If $P$ is the transition matrix of a reversible Markov chain on $N$ states then for all $i$,
$$
P_i(X_n=i)=p_{ii}^{(n)} = \sum_{k=1}^N a_k \lambda_k^n,
$$
where all $a_k\geq 0$ and all $\lambda_k$ are real and $\lambda_k\leq 1$.

Hint. To prove this, let $\pi$ be the invariant distribution and consider $A=VPV^{-1}$, where $V$ is the diagonal matrix with $V_{ii}=\sqrt{\pi_i}$. Show that $A$ is symmetric, and think about what that implies.
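Here is a small numerical check of the theorem along the lines of the hint (a Python sketch; the weighted graph is just an arbitrary reversible example, and the $a_k$ are read off from the orthonormal eigenvectors of $A$).

```python
import numpy as np

# A reversible chain: random walk on a weighted graph with symmetric
# weight matrix W; detailed balance holds with pi_i proportional to the
# total weight at vertex i.
W = np.array([[0, 2, 1, 0],
              [2, 0, 1, 3],
              [1, 1, 0, 1],
              [0, 3, 1, 0]], dtype=float)
P = W / W.sum(axis=1, keepdims=True)          # transition matrix
pi = W.sum(axis=1) / W.sum()                  # invariant distribution

# Symmetrize: A = V P V^{-1}, with V = diag(sqrt(pi_i)).
V = np.diag(np.sqrt(pi))
A = V @ P @ np.linalg.inv(V)
assert np.allclose(A, A.T)                    # detailed balance => A symmetric

# A symmetric matrix has real eigenvalues and an orthonormal eigenbasis U,
# and then p_ii^(n) = sum_k a_k lambda_k^n with a_k = U[i, k]^2 >= 0.
lam, U = np.linalg.eigh(A)
i, n = 0, 7
print(np.sum(U[i, :] ** 2 * lam ** n))        # spectral formula
print(np.linalg.matrix_power(P, n)[i, i])     # direct computation; equal
```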

Similarly, for the continuous-time version of a Markov chain (a so-called Markov process),
$$
P_i(X(t)=i)=p_{ii}(t) = \sum_{k=1}^N a_k e^{-\lambda_k t},
$$
where all $a_k\geq 0$ and all $\lambda_k$ are real and $\lambda_k\geq 0$.

I have used this fact to prove what I call the fast retries lemma (Theorem 1 in the paper Optimal call routing in VoIP). Suppose that you ring up your friend and find that his phone is engaged. There is a minimum time that it takes to retry the call, say $\tau$, so you could try to reach him at times $0,\tau,2\tau,\dotso$ until you get through. But might the expected time to get through be less if you retry at $\rho,2\rho,\dotso$, for some $\rho>\tau$? You might get through on the second attempt rather than the third, and $2\rho< 3\tau$.

It turns out the answer is no. Your best policy is to attempt contact at $0,\tau,2\tau,\dotso$, provided that the state of your friend can be modelled as a reversible Markov process, and that you get through as soon as you ring and find that he is not in the state (engaged) in which he was at time $0$. For example, your friend might be performing a random walk on a graph whose vertices are "engaged on the phone", "cooking", "sleeping", "watching TV", etc.

Suppose we retry every $t$ time units, and let state $0$ be "engaged", the state your friend was in at time $0$. By the Markov property, the expected time to connect is $ET$, where
$$
ET = t + p_{00}(t)\,ET, \quad\text{and hence}\quad ET = \frac{t}{1-p_{00}(t)}.
$$
So the lemma is saying that, subject to the constraint $t\geq \tau$, $ET$ is minimized by $t=\tau$.
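Here is a numerical illustration of the lemma (again just a Python sketch of my own with an arbitrary small reversible chain, not the argument in the paper). Using the spectral representation above for the continuous-time chain, $ET(t)=t/(1-p_{00}(t))$ comes out increasing in $t$, so over $t\geq\tau$ it is minimized at $t=\tau$.

```python
import numpy as np

# An arbitrary reversible continuous-time chain: random walk on a weighted
# graph, with generator Q = P - I.  State 0 plays the role of "engaged".
W = np.array([[0, 2, 1, 0],
              [2, 0, 1, 3],
              [1, 1, 0, 1],
              [0, 3, 1, 0]], dtype=float)
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()
Q = P - np.eye(4)

# Spectral representation: p_00(t) = sum_k a_k e^{nu_k t}, a_k >= 0, nu_k <= 0.
V = np.diag(np.sqrt(pi))
nu, U = np.linalg.eigh(V @ Q @ np.linalg.inv(V))

def p00(t):
    return float(np.sum(U[0, :] ** 2 * np.exp(nu * t)))

def expected_time_to_connect(t):
    # Retry every t time units; connect at the first retry epoch at which
    # the state is not 0.
    return t / (1 - p00(t))

ts = np.linspace(0.1, 5.0, 50)
vals = [expected_time_to_connect(t) for t in ts]
print(all(np.diff(vals) > 0))   # True: ET(t) is increasing in t
```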