
5.3: Reversible Markov Chains


    Many important Markov chains have the property that, in steady state, the sequence of states looked at backwards in time, i.e., \(\ldots, X_{n+1}, X_{n}, X_{n-1}, \ldots\), has the same probabilistic structure as the sequence of states running forward in time. This equivalence between the forward chain and backward chain leads to a number of results that are intuitively quite surprising and that are quite difficult to derive without using this equivalence. We shall study these results here and then extend them in Chapter 6 to Markov processes with a discrete state space. This set of ideas, and its use in queueing and queueing networks, has been an active area of queueing research over many years. It leads to many simple results for systems that initially look very complex. We only scratch the surface here and refer the interested reader to [13] for a more comprehensive treatment. Before going into reversibility, we describe the backward chain for an arbitrary Markov chain.

    The defining characteristic of a Markov chain \(\left\{X_{n} ; n \geq 0\right\}\) is that for all \(n \geq 0\),

    \[\operatorname{Pr}\left\{X_{n+1} \mid X_{n}, X_{n-1}, \ldots, X_{0}\right\}=\operatorname{Pr}\left\{X_{n+1} \mid X_{n}\right\}\label{5.29} \]

    For homogeneous chains, which we have been assuming throughout, \(\operatorname{Pr}\left\{X_{n+1}=j \mid X_{n}=i\right\}=P_{ij}\), independent of \(n\). For any \(k>1\), we can extend \ref{5.29} to get

    \[\begin{aligned}
    \operatorname{Pr} &\left\{X_{n+k}, X_{n+k-1}, \ldots, X_{n+1} \mid X_{n}, X_{n-1}, \ldots, X_{0}\right\} \\
    &=\operatorname{Pr}\left\{X_{n+k} \mid X_{n+k-1}\right\} \operatorname{Pr}\left\{X_{n+k-1} \mid X_{n+k-2}\right\} \ldots \operatorname{Pr}\left\{X_{n+1} \mid X_{n}\right\} \\
    &=\operatorname{Pr}\left\{X_{n+k}, X_{n+k-1}, \ldots, X_{n+1} \mid X_{n}\right\}
    \end{aligned}\label{5.30} \]

    By letting \(A^{+}\) be any event defined on the states \(X_{n+1}\) to \(X_{n+k}\) and letting \(A^{-}\) be any event defined on \(X_{0}\) to \(X_{n-1}\), this can be written more succinctly as

    \[\operatorname{Pr}\left\{A^{+} \mid X_{n}, A^{-}\right\}=\operatorname{Pr}\left\{A^{+} \mid X_{n}\right\}\label{5.31} \]

    This says that, given state \(X_{n}\), any future event \(A^{+}\) is statistically independent of any past event \(A^{-}\). This result, namely that past and future are independent given the present state, is equivalent to \ref{5.29} for defining a Markov chain, but it has the advantage of showing the symmetry between past and future. This symmetry is best brought out by multiplying both sides of \ref{5.31} by \(\operatorname{Pr}\left\{A^{-} \mid X_{n}\right\}\), obtaining\(^{2}\)

    \[\operatorname{Pr}\left\{A^{+}, A^{-} \mid X_{n}\right\}=\operatorname{Pr}\left\{A^{+} \mid X_{n}\right\} \operatorname{Pr}\left\{A^{-} \mid X_{n}\right\}\label{5.32} \]

    This symmetric form says that, conditional on the current state, the past and future states are statistically independent. Dividing both sides by \(\operatorname{Pr}\left\{A^{+} \mid X_{n}\right\}\) then yields

    \[\operatorname{Pr}\left\{A^{-} \mid X_{n}, A^{+}\right\}=\operatorname{Pr}\left\{A^{-} \mid X_{n}\right\}\label{5.33} \]

    By letting \(A^{-}\) be \(X_{n-1}\) and \(A^{+}\) be \(X_{n+1}, X_{n+2}, \ldots, X_{n+k}\), this becomes

    \(\operatorname{Pr}\left\{X_{n-1} \mid X_{n}, X_{n+1}, \ldots, X_{n+k}\right\}=\operatorname{Pr}\left\{X_{n-1} \mid X_{n}\right\}\)

    This is the equivalent form to \ref{5.29} for the backward chain, and says that the backward chain is also a Markov chain. By Bayes’ law, \(\operatorname{Pr}\left\{X_{n-1} \mid X_{n}\right\}\) can be evaluated as

    \[\operatorname{Pr}\left\{X_{n-1} \mid X_{n}\right\}=\frac{\operatorname{Pr}\left\{X_{n} \mid X_{n-1}\right\} \operatorname{Pr}\left\{X_{n-1}\right\}}{\operatorname{Pr}\left\{X_{n}\right\}}\label{5.34} \]

    Since the distribution of \(X_{n}\) can vary with \(n\), \(\operatorname{Pr}\left\{X_{n-1} \mid X_{n}\right\}\) can also depend on \(n\). Thus the backward Markov chain is not necessarily homogeneous. This should not be surprising, since the forward chain was defined with some arbitrary distribution for the initial state at time 0. This initial distribution was not relevant for equations \ref{5.29} to \ref{5.31}, but as soon as \(\operatorname{Pr}\left\{A^{-} \mid X_{n}\right\}\) was introduced, the initial state implicitly became a part of each equation and destroyed the symmetry between past and future. For a chain in steady state, however, \(\operatorname{Pr}\left\{X_{n}=j\right\}=\operatorname{Pr}\left\{X_{n-1}=j\right\}=\pi_{j}\) for all \(j\), and we have

    \[\operatorname{Pr}\left\{X_{n-1}=j \mid X_{n}=i\right\}=P_{j i} \pi_{j} / \pi_{i}\label{5.35} \]

    Thus the backward chain is homogeneous if the forward chain is in steady state. For a chain with steady-state probabilities \(\left\{\pi_{i} ; i \geq 0\right\}\), we define the backward transition probabilities \(P_{i j}^{*}\) as

    \[\pi_{i} P_{i j}^{*}=\pi_{j} P_{j i}\label{5.36} \]

    From \ref{5.35}, the backward transition probability \(P_{i j}^{*}\), for a Markov chain in steady state, is then equal to \(\operatorname{Pr}\left\{X_{n-1}=j \mid X_{n}=i\right\}\), the probability that the previous state is \(j\) given that the current state is \(i\).
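    The definition in \ref{5.36} translates directly into code. Below is a minimal NumPy sketch: the 3-state transition matrix is purely illustrative (not from the text), the steady-state vector is obtained by solving \(\pi P=\pi\), \(\sum_i \pi_i=1\), and the backward probabilities are then \(P_{ij}^{*}=\pi_j P_{ji}/\pi_i\).

    ```python
    import numpy as np

    # Hypothetical irreducible 3-state chain (illustrative matrix, not from the text).
    P = np.array([[0.5, 0.3, 0.2],
                  [0.2, 0.6, 0.2],
                  [0.3, 0.3, 0.4]])

    # Solve pi P = pi together with sum(pi) = 1 for the steady-state distribution.
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.concatenate([np.zeros(n), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Backward transition probabilities from (5.36): P*_{ij} = pi_j P_{ji} / pi_i.
    P_star = (P.T * pi) / pi[:, None]

    # Each row of P* sums to 1 precisely because pi is stationary.
    print(P_star.sum(axis=1))
    ```

    Note that the row sums of \(P^{*}\) equal 1 only because \(\pi\) is the steady-state vector: \(\sum_j \pi_j P_{ji}/\pi_i = \pi_i/\pi_i = 1\).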

    Now consider a new Markov chain with transition probabilities \(\left\{P_{i j}^{*}\right\}\). Over some segment of time for which both this new chain and the old chain are in steady state, the set of states generated by the new chain is statistically indistinguishable from the backward running sequence of states from the original chain. It is somewhat simpler, in talking about forward and backward running chains, however, to visualize Markov chains running in steady state from \(t=-\infty\) to \(t=+\infty\). If one is uncomfortable with this, one can also visualize starting the Markov chain at some very negative time with the initial distribution equal to the steady-state distribution.

    Definition 5.3.1

    A Markov chain that has steady-state probabilities \(\left\{\pi_{i} ; i \geq 0\right\}\) is reversible if \(P_{i j}=\pi_{j} P_{j i} / \pi_{i}\) for all \(i, j\), i.e., if \(P_{i j}^{*}=P_{i j}\) for all \(i\), \(j\).

    Thus the chain is reversible if, in steady state, the backward running sequence of states is statistically indistinguishable from the forward running sequence. Comparing \ref{5.36} with the steady-state equations \ref{5.25} that we derived for birth-death chains, we have the important theorem:

    Theorem 5.3.1

    Every birth-death chain with a steady-state probability distribution is reversible.

    We saw that for birth-death chains, the equation \(\pi_{i} P_{i j}=\pi_{j} P_{j i}\) (which only had to be considered for \(|i-j| \leq 1\)) provided a very simple way of calculating the steady-state probabilities. Unfortunately, it appears that we must first calculate the steady-state probabilities in order to show that a chain is reversible. The following simple theorem gives us a convenient escape from this dilemma.
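    The calculation mentioned above can be sketched concretely. For a birth-death chain, detailed balance \(\pi_{i} P_{i, i+1}=\pi_{i+1} P_{i+1, i}\) gives each \(\pi_{i+1}\) as a running product, after which the vector is normalized to sum to 1. The birth and death probabilities below are hypothetical, chosen only for illustration.

    ```python
    import numpy as np

    # Birth-death chain on states 0..4 with hypothetical transition probabilities.
    p = [0.4, 0.4, 0.3, 0.2]   # p[i] = P_{i,i+1} (birth probability from state i)
    q = [0.3, 0.3, 0.4, 0.5]   # q[i] = P_{i+1,i} (death probability from state i+1)

    # Detailed balance pi_i p[i] = pi_{i+1} q[i] gives a running product,
    # then normalize so the probabilities sum to 1.
    rho = np.ones(5)
    for i in range(4):
        rho[i + 1] = rho[i] * p[i] / q[i]
    pi = rho / rho.sum()
    print(pi)
    ```

    The self-loop probabilities \(P_{ii}\) never enter: detailed balance across each edge \(\{i, i+1\}\) determines the ratios \(\pi_{i+1}/\pi_i\) completely.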

    Theorem 5.3.2

    Assume that an irreducible Markov chain has transition probabilities \(\left\{P_{i j}\right\}\). Suppose \(\left\{\pi_{i}\right\}\) is a set of positive numbers summing to 1 and satisfying

    \[\pi_{i} P_{i j}=\pi_{j} P_{j i} ; \quad \text { all } i, j\label{5.37} \]

    then, first, \(\left\{\pi_{i} ; i \geq 0\right\}\) is the steady-state distribution for the chain, and, second, the chain is reversible.

    Proof

    Given a solution to \ref{5.37} for all \(i\) and \(j\), we can sum this equation over \(i\) for each \(j\).

    \[\sum_{i} \pi_{i} P_{i j}=\pi_{j} \sum_{i} P_{j i}=\pi_{j}\label{5.38} \]

    Thus the solution to \ref{5.37}, along with the constraints \(\pi_{i}>0\), \(\sum_{i} \pi_{i}=1\), satisfies the steady-state equations, (5.18), and, from Theorem 5.1.4, this is the unique steady-state distribution. Since \ref{5.37} is satisfied, the chain is also reversible.

    It is often possible, sometimes by using an educated guess, to find a solution to \ref{5.37}. If this is successful, then we are assured both that the chain is reversible and that the actual steady-state probabilities have been found.
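    A classic instance of such an educated guess (used here purely as an illustration; the example is not from the text) is the random walk on an undirected graph, where one conjectures \(\pi_{i}\) proportional to the degree of node \(i\) and then verifies \ref{5.37} directly:

    ```python
    import numpy as np

    # Random walk on an undirected graph: from each node, move to a uniformly
    # chosen neighbor. Educated guess: pi_i proportional to the degree of i.
    adj = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 1],
                    [1, 1, 0, 1],
                    [0, 1, 1, 0]], dtype=float)
    deg = adj.sum(axis=1)
    P = adj / deg[:, None]      # P_ij = 1/deg(i) for each neighbor j
    pi = deg / deg.sum()

    # Verify (5.37): pi_i P_ij = pi_j P_ji for all i, j.
    lhs = pi[:, None] * P       # lhs[i, j] = pi_i P_ij
    print(np.allclose(lhs, lhs.T))
    ```

    The check succeeds because \(\pi_{i} P_{ij} = (\text{deg}_i/\sum_k \text{deg}_k)\cdot(1/\text{deg}_i) = 1/\sum_k \text{deg}_k\) for every edge, which is symmetric in \(i\) and \(j\). By Theorem 5.3.2, this single verification establishes both the steady-state distribution and reversibility.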

    Note that the theorem applies to periodic chains as well as to aperiodic chains. If the chain is periodic, then the steady-state probabilities have to be interpreted as average values over the period, but Theorem 5.1.4 shows that \ref{5.38} still has a unique solution (assuming an irreducible chain). On the other hand, for a chain with period \(d>1\), there are \(d\) subclasses of states and the sequence \(\left\{X_{n}\right\}\) must rotate between these classes in a fixed order. For this same order to be followed in the backward chain, the only possibility is \(d=2\). Thus periodic chains with periods other than 2 cannot be reversible.

    There are several simple tests that can be used to show that some given irreducible chain is not reversible. First, the steady-state probabilities must satisfy \(\pi_{i}>0\) for all \(i\), and thus, if \(P_{i j}>0\) but \(P_{j i}=0\) for some \(i, j\), then \ref{5.37} cannot be satisfied and the chain is not reversible. Second, consider any set of three states, \(i, j, k\). If \(P_{j i} P_{i k} P_{k j}\) is unequal to \(P_{j k} P_{k i} P_{i j}\) then the chain cannot be reversible. To see this, note that \ref{5.37} requires that

    \(\pi_{i}=\pi_{j} P_{j i} / P_{i j}=\pi_{k} P_{k i} / P_{i k}\)

    Thus, \(\pi_{j} P_{j i} P_{i k}=\pi_{k} P_{k i} P_{i j}\). Equation \ref{5.37} also requires that \(\pi_{j} P_{j k}=\pi_{k} P_{k j}\). Taking the ratio of these equations, we see that \(P_{j i} P_{i k} P_{k j}=P_{j k} P_{k i} P_{i j}\). Thus if this equation is not satisfied, the chain cannot be reversible. In retrospect, this result is not surprising. What it says is that for any cycle of three states, the probability of three transitions going around the cycle in one direction must be the same as the probability of going around the cycle in the opposite (and therefore backwards) direction.
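    This three-state cycle test is easy to automate. The sketch below (the helper name and the example matrix are my own, chosen for illustration) scans all triples and reports the first one violating \(P_{ji} P_{ik} P_{kj}=P_{jk} P_{ki} P_{ij}\):

    ```python
    import numpy as np
    from itertools import combinations

    def fails_triangle_test(P, tol=1e-12):
        """Return a triple (i, j, k) with P_ji P_ik P_kj != P_jk P_ki P_ij, or None."""
        n = P.shape[0]
        for i, j, k in combinations(range(n), 3):
            if abs(P[j, i] * P[i, k] * P[k, j]
                   - P[j, k] * P[k, i] * P[i, j]) > tol:
                return (i, j, k)
        return None

    # A chain that preferentially cycles 0 -> 1 -> 2 -> 0; clearly not reversible,
    # since the two directions around the cycle have unequal probability.
    P = np.array([[0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.8, 0.1, 0.1]])
    print(fails_triangle_test(P))   # -> (0, 1, 2)
    ```

    For this matrix the clockwise product is \(0.8^3 = 0.512\) and the counterclockwise product is \(0.1^3 = 0.001\), so the test immediately rules out reversibility. A passing test on all triples does not by itself prove reversibility; per the cycle condition cited below, cycles of every length would have to be checked.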

    It is also true (see [16] for a proof) that a necessary and sufficient condition for a chain to be reversible is that the product of transition probabilities around any cycle of arbitrary length must be the same as the product of transition probabilities going around the cycle in the opposite direction. This doesn’t seem to be a widely useful way to demonstrate reversibility.

    There is another result, generalizing Theorem 5.3.2, for finding the steady-state probabilities of an arbitrary Markov chain and simultaneously finding the transition probabilities of the backward chain.

    Theorem 5.3.3

    Assume that an irreducible Markov chain has transition probabilities \(\left\{P_{i j}\right\}\). Suppose \(\left\{\pi_{i}\right\}\) is a set of positive numbers summing to 1 and that \(\left\{P_{i j}^{*}\right\}\) is a set of transition probabilities satisfying

    \[\pi_{i} P_{i j}=\pi_{j} P_{j i}^{*} ; \quad \text { all } i, j\label{5.39} \]

    Then \(\left\{\pi_{i}\right\}\) is the steady-state distribution and \(\left\{P_{i j}^{*}\right\}\) is the set of transition probabilities for the backward chain.

    Proof

    Summing \ref{5.39} over \(i\), we get the steady-state equations for the Markov chain, so the fact that the given \(\left\{\pi_{i}\right\}\) satisfy these equations asserts that they are the steady-state probabilities. Equation \ref{5.39} then asserts that \(\left\{P_{i j}^{*}\right\}\) is the set of transition probabilities for the backward chain.

    The following two sections illustrate some important applications of reversibility.


    \(^{2}\)Much more broadly, any 3 events, say \(A^{-}, X_{0}, A^{+}\), are said to be Markov if \(\operatorname{Pr}\left\{A^{+} \mid X_{0}, A^{-}\right\}=\operatorname{Pr}\left\{A^{+} \mid X_{0}\right\}\), and this implies the more symmetric form \(\operatorname{Pr}\left\{A^{-}, A^{+} \mid X_{0}\right\}=\operatorname{Pr}\left\{A^{-} \mid X_{0}\right\} \operatorname{Pr}\left\{A^{+} \mid X_{0}\right\}\).


    This page titled 5.3: Reversible Markov Chains is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Robert Gallager (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.