7.10: Markov Modulated Random Walks


    Frequently it is useful to generalize random walks to allow some dependence between the variables being summed. The particular form of dependence here is the same as the Markov reward processes of Section 3.5. The treatment in Section 3.5 discussed only expected rewards, whereas the treatment here focuses on the random variables themselves. Let {\(Y_m; m ≥ 0\)} be a sequence of (possibly dependent) rv’s, and let

    \[ \{ S_n; n\geq 1\} \qquad \text{where } S_n=\sum^{n-1}_{m=0} Y_m \nonumber \]

    be the process of successive sums of these random variables. Let {\(X_n; n ≥ 0\)} be a Markov chain, and assume that each \(Y_n\) can depend on \(X_n\) and \(X_{n+1}\). Conditional on \(X_n\) and \(X_{n+1}\), however, \(Y_n\) is independent of \(Y_{n−1}, . . . , Y_1\), and of \(X_i\) for all \(i \neq n, n−1\). Assume that \(Y_n\), conditional on \(X_n\) and \(X_{n+1}\), has a distribution function \(F_{ij}(y) = \text{Pr}\{Y_n ≤ y | X_n = i, X_{n+1} = j\}\). Thus each rv \(Y_n\) depends only on the associated transition in the Markov chain, and this dependence is the same for all \(n\).

    The process {\(S_n; n ≥ 1\)} is called a Markov modulated random walk. If each \(Y_m\) is positive, it is also the sequence of epochs in a semi-Markov process. For each \(m\), \(Y_m\) is associated with the transition in the Markov chain from time \(m\) to \(m + 1\), and \( S_n\) is the aggregate reward up to but not including time \(n\). Let \(\overline{Y}_{ij}\) denote \(\mathsf{E} [Y_n | X_n = i, X_{n+1} = j]\) and \(\overline{Y}_i\) denote \(\mathsf{E} [Y_n | X_n = i]\). Let {\(P_{ij} \)} be the set of transition probabilities for the Markov chain, so \(\overline{Y}_i = \sum_j P_{ij} \overline{Y}_{ ij} \). We may think of the process {\(Y_n; n ≥ 0\)} as evolving along with the Markov chain. The distributions of the variables \(Y_n\) are associated with the transitions from \(X_n\) to \(X_{n+1}\), but the \(Y_n\) are otherwise independent random variables.
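
    To make these definitions concrete, the following minimal sketch (in Python) simulates such a walk. The two-state chain, the transition probabilities, and the choice of Gaussian conditional distributions \(F_{ij}\) are illustrative assumptions, not taken from the text.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative two-state chain; all numbers here are assumptions.
    P = np.array([[0.9, 0.1],
                  [0.4, 0.6]])           # transition probabilities P_ij
    Ybar_ij = np.array([[1.0, -2.0],
                        [3.0,  0.5]])    # E[Y_n | X_n = i, X_{n+1} = j]

    def sample_path(n, k=0):
        """Simulate X_0,...,X_n and S_1,...,S_n, where S_n = Y_0 + ... + Y_{n-1}."""
        X = np.empty(n + 1, dtype=int)
        X[0] = k
        Y = np.empty(n)
        for m in range(n):
            X[m + 1] = rng.choice(2, p=P[X[m]])
            # Y_m depends only on the transition (X_m, X_{m+1}); each F_ij is
            # taken here to be Gaussian with unit variance (an assumption).
            Y[m] = rng.normal(Ybar_ij[X[m], X[m + 1]], 1.0)
        return X, np.cumsum(Y)

    X, S = sample_path(1000)   # one sample path of the walk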

    In order to define a martingale related to the process {\(S_n; n ≥ 1\)}, we must subtract the mean reward from {\(S_n\)} and must also compensate for the effect of the state of the Markov chain. The appropriate compensation factor turns out to be the relative-gain vector defined in Section 3.5.

    For simplicity, consider only finite-state irreducible Markov chains with \(\mathsf{M}\) states. Let \(π = (π_1, . . . , π_{\mathsf{M}})\) be the steady-state probability vector for the chain, let \(\overline{\boldsymbol{Y}} = (\overline{Y}_1, . . . , \overline{Y}_{\mathsf{M}} )^{\mathsf{T}}\) be the vector of expected rewards, let \(g = π\overline{Y}\) be the steady-state gain per unit time, and let \(\boldsymbol{w} = (w_1, . . . , w_{\mathsf{M}})^{\mathsf{T}}\) be the relative-gain vector. From Theorem 3.5.1, \(\boldsymbol{w}\) is the unique solution to

    \[ \boldsymbol{w} +ge = \overline{\boldsymbol{Y}} + [P]\boldsymbol{w} \quad ; \quad w_1 =0 \nonumber \]
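
    For the illustrative chain above, \(g\) and \(\boldsymbol{w}\) can be computed numerically. A sketch, reusing P and Ybar_ij from the previous code and pinning \(w_1 = 0\) (here w[0]):

    # Solve w + g e = Ybar + [P] w with w_1 = 0, reusing P and Ybar_ij above.
    pi = np.linalg.matrix_power(P, 200)[0]    # steady-state vector (chain is ergodic)
    Ybar = (P * Ybar_ij).sum(axis=1)          # Ybar_i = sum_j P_ij Ybar_ij
    g = pi @ Ybar                             # steady-state gain per unit time

    M = P.shape[0]
    A = np.zeros((M, M))                      # unknowns: (w_2, ..., w_M, g)
    A[:, :M - 1] = (np.eye(M) - P)[:, 1:]     # columns multiplying w_2, ..., w_M
    A[:, M - 1] = 1.0                         # column multiplying g
    sol = np.linalg.solve(A, Ybar)            # (I - [P]) w + g e = Ybar, w_1 = 0
    w = np.concatenate(([0.0], sol[:M - 1]))  # relative-gain vector
    assert np.isclose(sol[-1], g)             # solved gain matches pi @ Ybar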

    We assume a fixed starting state \(X_0 = k\). As we now show, the process {\(Z_n; n ≥ 1\)} given by

    \[Z_n = S_n -ng +w_{X_n}-w_k \quad ; \quad n\geq 1 \nonumber \]

    is a martingale. First condition on a given state, \(X_{n−1} = i\).

    \[ \mathsf{E} [Z_n| Z_{n-1},Z_{n-2},...,Z_1, X_{n-1} =i ] \nonumber \]

    Since \(S_n = S_{n−1} + Y_{n−1}\), we can express \(Z_n\) as

    \[ Z_n = Z_{n-1} + Y_{n-1}-g+w_{X_n}-w_{X_{n-1}} \nonumber \]

    Since \(\mathsf{E} [Y_{n−1} | X_{n−1} = i] = \overline{Y}_i\) and \(\mathsf{E} [w_{X_n} | X_{n−1} = i] = \sum_j P_{ij} w_j \), we have

    \[ \mathsf{E} [Z_n|Z_{n-1},Z_{n-2},...,Z_1,X_{n-1} =i] = Z_{n-1}+\overline{Y}_i -g+\sum_j P_{ij}w_j-w_i \nonumber \]

    From (7.10.2), the final four terms in (7.10.6) sum to 0, so

    \[ \mathsf{E} [Z_n|Z_{n-1},...,Z_1,X_{n-1} =i]=Z_{n-1} \nonumber \]

    Since this is valid for all choices of \( X_{n−1}\), we have \(\mathsf{E} [Z_n | Z_{n−1}, . . . , Z_1] = Z_{n−1}\). Since the expected values of all the reward variables \(\overline{Y}_{ij}\) exist, we see that \(\mathsf{E} [|Y_n|] <\infty\), so that \(\mathsf{E} [|Z_n|] < \infty\) also. This verifies that {\(Z_n; n ≥ 1\)} is a martingale. It can be verified similarly that \(\mathsf{E} [Z_1] = 0\), so \(\mathsf{E} [Z_n] = 0\) for all \(n ≥ 1\).
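
    A quick Monte-Carlo check of this zero-mean property, reusing the simulation sketch and the values of g and w computed above (with \(k = 0\); the sample sizes are arbitrary):

    # Check E[Z_n] = 0 by simulation: Z_n = S_n - n g + w_{X_n} - w_k, with k = 0.
    n, trials = 50, 20000
    Z = np.empty(trials)
    for t in range(trials):
        X, S = sample_path(n)
        Z[t] = S[-1] - n * g + w[X[-1]] - w[0]
    print(Z.mean())   # close to 0, up to Monte-Carlo noise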

    In showing that {\(Z_n; n ≥ 1\)} is a martingale, we actually showed something a little stronger. That is, (7.10.7) is conditioned on \(X_{n−1}\) as well as \(Z_{n−1}, . . . , Z_1\). In the same way, it follows that for all \(n > 1\),

    \[\mathsf{E}[Z_n|Z_{n-1},X_{n-1},Z_{n-2},X_{n-2},...,Z_1,X_1 ] = Z_{n-1} \nonumber \]

    In terms of the gambling analogy, this says that {\(Z_n; n ≥ 1\)} is fair for each possible past sequence of states. A martingale {\(Z_n; n ≥ 1\)} with this property (\(i.e.\), satisfying (7.10.8)) is said to be a martingale relative to the joint process {\(Z_n, X_n; n ≥ 1\)}. We will use this martingale later to discuss threshold crossing problems for Markov modulated random walks. We shall see that the added property of being a martingale relative to {\(Z_n, X_n\)} gives us added flexibility in defining stopping times.

    As an added bonus to this example, note that if {\(X_n; n ≥ 0\)} is taken as the embedded chain of a Markov process (or semi-Markov process), and if \(Y_n\) is taken as the time interval from transition \(n\) to \(n + 1\), then \(S_n\) becomes the epoch of the \(n\)th transition in the process.

    Generating functions for Markov random walks

    Consider the same Markov chain and reward variables as in the previous example, and assume that for each pair of states, \(i, \, j\), the moment generating function

    \[ \mathsf{g}_{ij}(r) = \mathsf{E}[\exp(rY_n)|X_n=i,X_{n+1} =j ] \nonumber \]

    exists over some open interval (\(r_−, r_+\)) containing 0. Let \([Γ(r)]\) be the matrix with terms \(P_{ij} \mathsf{g}_{ij} (r)\). Since \([Γ(r)]\) is an irreducible nonnegative matrix, Theorem 3.4.1 shows that \([Γ(r)]\) has a largest real eigenvalue, \(ρ(r) > 0\), and an associated positive right eigenvector, \(\boldsymbol{\nu}(r) = (\nu_1(r), . . . , \nu_{\mathsf{M}}(r))^{\mathsf{T}}\), that is unique within a scale factor. We now show that the process {\(M_n(r); n ≥ 1\)} defined by

    \[ M_n(r) = \dfrac{\exp (rS_n)\nu_{X_n}(r) }{\rho (r)^n\nu_k(r)} \nonumber \]

    is a product-type martingale for each \(r ∈ (r_−, r_+)\). Since \(S_n = S_{n−1} + Y_{n−1}\), we can express \(M_n(r)\) as

    \[ M_n(r)=M_{n-1}(r)\, \dfrac{\exp (rY_{n-1})\, \nu_{X_n} (r)}{\rho (r)\, \nu_{X_{n-1}} (r)} \nonumber \]

    The expected value of the ratio in (7.10.11), conditional on \(X_{n−1} = i\), is

    \[ \mathsf{E} \left[ \left. \dfrac{\exp (rY_{n-1})\,\nu_{X_n} (r)}{\rho (r)\,\nu_i (r)} \,\right|\, X_{n-1}=i \right] \quad = \quad \dfrac{\sum_{j} P_{ij}\, \mathsf{g}_{ij}(r)\, \nu_j (r)}{\rho (r)\, \nu_i (r)} =1 \nonumber \]

    where, in the last step, we have used the fact that \(\nu (r)\) is an eigenvector of \([Γ(r)]\). Thus, \(\mathsf{E} [M_n(r) | M_{n-1}(r), . . . , M_1(r), X_{n-1} = i] = M_{n−1}(r)\). Since this is true for all choices of \(i\), the condition on \(X_{n−1} = i\) can be removed and {\(M_n(r); n ≥ 1\)} is a martingale. Also, for \(n > 1\),

    \[ \mathsf{E} [M_n(r) | M_{n-1}(r) , X_{n-1}, ..., M_1(r), X_1] = M_{n-1}(r) \nonumber \]

    so that {\(M_n(r); n ≥ 1\)} is also a martingale relative to the joint process {\(M_n(r), X_n; n ≥ 1\)}.

    It can be verified by the same argument as in (7.10.12) that \(\mathsf{E} [M_1(r)] = 1\). It then follows that \(\mathsf{E} [M_n(r)] = 1\) for all \(n ≥ 1\).
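
    Continuing the numerical sketch, \([Γ(r)]\), \(ρ(r)\), and \(\boldsymbol{\nu}(r)\) are directly computable. For the Gaussian rewards assumed earlier, \(\mathsf{g}_{ij}(r) = \exp(r\overline{Y}_{ij} + r^2/2)\), so (\(r_−, r_+\)) is the whole line. The check that \(\mathsf{E}[M_n(r)] = 1\) is by simulation, with all numbers illustrative:

    # Perron eigenvalue rho(r) and positive right eigenvector nu(r) of [Gamma(r)].
    def gamma(r):
        # g_ij(r) = exp(r Ybar_ij + r^2/2) for the Gaussian rewards assumed above
        return P * np.exp(r * Ybar_ij + r**2 / 2.0)

    def perron(r):
        vals, vecs = np.linalg.eig(gamma(r))
        i = np.argmax(vals.real)         # largest real eigenvalue is rho(r)
        nu = np.abs(vecs[:, i].real)     # Perron vector has constant sign
        return vals[i].real, nu

    # Monte-Carlo check that E[M_n(r)] = 1 (k = 0; modest n to limit variance).
    r = 0.2
    rho, nu = perron(r)
    n, trials = 10, 20000
    Mn = np.empty(trials)
    for t in range(trials):
        X, S = sample_path(n)
        Mn[t] = np.exp(r * S[-1]) * nu[X[-1]] / (rho**n * nu[0])
    print(Mn.mean())   # close to 1, up to Monte-Carlo noise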

    One of the uses of this martingale is to provide exponential upper bounds, similar to (7.18), to the probabilities of threshold crossings for Markov modulated random walks. Define

    \[ \widetilde{M}_n(r) = \dfrac{\exp (rS_n)\min_j (\nu_j (r))}{\rho(r)^n\nu_k(r) } \nonumber \]

    Then \(\widetilde{M}_n(r) ≤ M_n(r)\), so \(\mathsf{E}\left[ \widetilde{M}_n(r) \right] ≤ 1\). For any \(μ > 0\), the Markov inequality can be applied to \(\widetilde{M}_n (r)\) to get

    \[ \text{Pr}\left\{ \widetilde{M}_n(r) \geq \mu \right\} \leq \dfrac{1}{\mu} \mathsf{E} \left[ \widetilde{M}_n(r) \right] \leq \dfrac{1}{\mu} \nonumber \]

    For any given \(α\) and any given \(r\), \(0 < r < r_+\), we can choose \(μ = \exp(rα)ρ(r)^{−n} \min_j (\nu_j (r))/\nu_k(r)\). Since \(r > 0\), the event \(\{\widetilde{M}_n(r) ≥ μ\}\) is then the same as the event \(\{S_n ≥ α\}\). Combining (7.10.14) and (7.10.15),

    \[ \text{Pr} \{ S_n \geq \alpha \} \leq \rho (r)^n\exp(-r\alpha )\nu_k(r)/\min_j(\nu_j (r)) \nonumber \]

    This can be optimized over \(r\) to get the tightest bound in the same way as (7.18).
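
    A sketch of that optimization by bounded scalar search, reusing perron from above (the use of SciPy, and the values of \(n\) and \(α\), are assumptions):

    # Optimize Pr{S_n >= alpha} <= rho(r)^n exp(-r alpha) nu_k(r)/min_j nu_j(r).
    from scipy.optimize import minimize_scalar   # assumes SciPy is available

    def log_bound(r, n, alpha, k=0):
        rho, nu = perron(r)
        return n * np.log(rho) - r * alpha + np.log(nu[k] / nu.min())

    n, alpha = 100, 150.0                        # illustrative horizon and threshold
    res = minimize_scalar(lambda r: log_bound(r, n, alpha),
                          bounds=(1e-6, 5.0), method='bounded')
    print(np.exp(res.fun))                       # tightest bound over r in (0, 5)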

    Stopping trials for martingales relative to a process

    A martingale {\(Z_n; n ≥ 1\)} relative to a joint process {\(Z_n, X_n; n ≥ 1\)} was defined as a martingale for which (7.10.8) is satisfied, \(i.e.\), \(\mathsf{E} [Z_n | Z_{n−1}, X_{n−1}, . . . , Z_1, X_1] = Z_{n−1}\). In the same way, we can define a submartingale or supermartingale {\(Z_n; n ≥ 1\)} relative to a joint process {\(Z_n, X_n; n ≥ 1\)} as a submartingale or supermartingale satisfying (7.10.8) with the \(=\) sign replaced by \(≥\) or \(≤\) respectively. The purpose of this added complication is to make it easier to define useful stopping rules.

    As generalized in Definition 4.5.2, a generalized stopping trial \(J\) for a sequence of pairs of rv’s \((Z_1, X_1), (Z_2, X_2), . . . \) is a positive integer-valued rv such that, for each \(n ≥ 1\), \(\mathbb{I}_{\{J=n\}}\) is a function of \(Z_1, X_1, Z_2, X_2, . . . , Z_n, X_n\).

    Theorems 7.8.1, 7.8.2 and Lemma 7.8.1 all carry over to martingales (submartingales or supermartingales) relative to a joint process. These theorems are stated more precisely in Exercises 7.23 to 7.26. To summarize them here, assume that {\(Z_n; n ≥ 1\)} is a martingale (submartingale or supermartingale) relative to a joint process {\(Z_n, X_n; n ≥ 1\)} and assume that \(J\) is a stopping trial for {\(Z_n; n ≥ 1\)} relative to {\(Z_n, X_n; n ≥ 1\)}. Then the stopped process is a martingale (submartingale or supermartingale) respectively, (7.8.3)-(7.8.5) are satisfied, and, for a martingale, \(\mathsf{E}[Z_J ] = \mathsf{E} [Z_1]\) is satisfied iff (7.8.9) is satisfied.

    Markov modulated random walks with thresholds

    We have now developed two martingales for Markov modulated random walks, both conditioned on a fixed initial state \(X_0 = k\). The first, given in (7.10.3), is {\(Z_n = S_n − ng + w_{X_n} − w_k; n ≥ 1\)}. Recall that \(\mathsf{E} [Z_n] = 0\) for all \(n ≥ 1\) for this martingale. Given two thresholds, \(α> 0\) and \(β< 0\), define \(J\) as the smallest \(n\) for which \(S_n ≥ α\) or \(S_n ≤ β\). The indicator function \(\mathbb{I}_{J=n}\) of \(\{J = n\}\) is 1 iff \(β< S_i < α\) for \(1 ≤ i ≤ n − 1\) and either \(S_n ≥ α\) or \(S_n ≤ β\). Since \(S_i = Z_i + ig − w_{X_i} + w_k\), \(S_i\) is a function of \( Z_i\) and \(X_i\), so the stopping trial is a function of both \(Z_i\) and \(X_i\) for \(1 ≤ i ≤ n\). It follows that \(J\) is a stopping trial for {\(Z_n; n ≥ 1\)} relative to {\(Z_n, X_n; n ≥ 1\)}. From Lemma 7.8.1, we can assert that \(\mathsf{E} [Z_J ] = \mathsf{E} [Z_1] = 0\) if (7.8.9) is satisfied, \(i.e.\), if \(\lim_{n→\infty} \mathsf{E} [Z_n | J > n] \text{Pr}\{J > n\} = 0\). Using the same argument as in Lemma 7.5.1, we can see that \(\text{Pr}\{ J > n\}\) goes to 0 at least geometrically in \(n\). Conditional on \(J > n\), \(β< S_n < α\), so \(S_n\) is bounded independent of \(n\). Also \(w_{X_n}\) is bounded, since the chain is finite state, and \(ng\) is linear in \(n\). Thus \(\mathsf{E} [Z_n | J > n]\) varies at most linearly with \(n\), so (7.8.9) is satisfied, and

    \[ 0 = \mathsf{E}[Z_J]=\mathsf{E}[S_J]-\mathsf{E}[J]g+\mathsf{E}[w_{X_J}] -w_k \nonumber \]

    Recall that Wald’s equality for random walks is \(\mathsf{E} [S_J ] = \mathsf{E} [J] \overline{X}\). For Markov modulated random walks, this is modified, as shown in (7.10.17), by the relative-gain vector terms.
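
    A Monte-Carlo check of (7.10.17) on the running example (the thresholds and the horizon are arbitrary; the horizon is taken long enough that a threshold is essentially always crossed):

    # Check E[S_J] = E[J] g - E[w_{X_J}] + w_k by simulation, with k = 0.
    alpha, beta = 20.0, -20.0
    SJ, J, wXJ = [], [], []
    for t in range(2000):
        X, S = sample_path(2000)     # long enough that a crossing occurs
        j = np.nonzero((S >= alpha) | (S <= beta))[0][0]   # S[j] is S_{j+1}
        SJ.append(S[j]); J.append(j + 1); wXJ.append(w[X[j + 1]])
    print(np.mean(SJ), np.mean(J) * g - np.mean(wXJ) + w[0])   # nearly equal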

    The same arguments can be applied to the generating function martingale of (7.10.10). Again, let \( J\) be the smallest \(n\) for which \(S_n ≥ α\) or \(S_n ≤ β\). As before, \(S_i\) is a function of \(M_i(r)\) and \(X_i\), so \(\mathbb{I}_{\{J=n\}}\) is a function of \(M_i(r)\) and \(X_i\) for \(1 ≤ i ≤ n\). It follows that \(J\) is a stopping trial for {\(M_n(r); n ≥ 1\)} relative to {\(M_n(r), X_n; n ≥ 1\)}. Next we need the following lemma:

    Lemma 7.10.1. For the martingale {\(M_n(r); n ≥ 1\)} relative to {\(M_n(r), X_n; n ≥ 1\)} defined in (7.10.10), where {\(X_n; n ≥ 0\)} is a finite-state Markov chain, and for the above stopping trial \(J\),

    \[ \lim_{n\rightarrow \infty} \mathsf{E}[M_n(r) | J>n]\, \text{Pr}\{ J>n\} =0 \nonumber \]

    Proof: From Lemma 4, slightly modified for the case here, there is a \(δ> 0\) such that for all states \(i\), \(j\), and all \(n > 1\) such that \(\text{Pr}\{J = n, X_{n−1} = i, X_n = j\} > 0\),

    \[ \mathsf{E}[\exp(rS_n)|J=n,X_{n-1}=i,X_n=j] \geq \delta \nonumber \]

    Since the stopped process, {\(M^∗_n(r); n ≥ 1\)}, is a martingale, we have for each \(m\),

    \[ 1=\mathsf{E}[M_m^*(r)] \geq \sum^m_{n=1} \dfrac{\mathsf{E}[\exp (rS_n)\nu_{X_n}(r) \,|\, J=n]\,\text{Pr}\{J=n\}}{\rho(r)^n\nu_k(r)} \nonumber \]

    From (7.10.19), we see that there is some \(δ' > 0\) such that

    \[ \mathsf{E}[\exp(rS_n)\nu_{X_n}(r)/\nu_k(r)\,|\,J=n] \geq \delta' \nonumber \]

    for all \( n\) such that \(\text{Pr}\{J = n\} > 0\). Thus (7.10.20) is bounded by

    \[ 1\geq \delta' \sum_{n\leq m}\rho(r)^{-n}\,\text{Pr}\{J=n\} \nonumber \]

    Since this is valid for all \(m\), it follows by the argument in the proof of Theorem 7.5.1 that \(\lim_{n→\infty} ρ(r)^{-n}\text{Pr}\{J > n\} = 0\). Since \(β< S_n < α\) conditional on \(J > n\), \(\mathsf{E}[\exp(rS_n) | J > n]\) is bounded, so \(\mathsf{E}[M_n(r) | J > n]\,\text{Pr}\{J > n\}\) is at most a constant times \(ρ(r)^{-n}\,\text{Pr}\{J > n\}\). This establishes (7.10.18), completing the proof.

    \(\square \)

    From Lemma 7.8.1, we have the desired result:

    \[ \mathsf{E}[M_J(r)] =\mathsf{E} \left[ \dfrac{\exp (rS_J)\,\nu_{X_J} (r)}{[\rho(r)]^J\,\nu_k (r)} \right] =1 ; \qquad r_-<r<r_+ \nonumber \]

    This is the extension of the Wald identity to Markov modulated random walks, and is used in the same way as the Wald identity. As shown in Exercise 7.28, the derivative of (7.10.21), evaluated at \(r = 0\), is the same as (7.10.17).
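
    In outline (a sketch of that exercise, normalizing \(\boldsymbol{\nu}(0) = \boldsymbol{e}\) and assuming the derivative can be carried inside the expectation): since \([Γ(0)] = [P]\), \(ρ(0) = 1\), and \([Γ'(0)]\) has terms \(P_{ij}\overline{Y}_{ij}\), differentiating \([Γ(r)]\boldsymbol{\nu}(r) = ρ(r)\boldsymbol{\nu}(r)\) at \(r = 0\) gives

    \[ \overline{\boldsymbol{Y}} + [P]\boldsymbol{\nu}'(0) = \rho'(0)\boldsymbol{e} + \boldsymbol{\nu}'(0) \nonumber \]

    Premultiplying by \(π\) shows that \(ρ'(0) = π\overline{\boldsymbol{Y}} = g\), and comparison with (7.10.2) then shows that \(\boldsymbol{\nu}'(0) = \boldsymbol{w} + c\boldsymbol{e}\) for some constant \(c\). Differentiating (7.10.21) at \(r = 0\) then gives \(\mathsf{E}[S_J] - \mathsf{E}[J]g + \mathsf{E}[w_{X_J}] - w_k = 0\) (the constant \(c\) cancels), which is (7.10.17).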


    This page titled 7.10: Markov Modulated Random Walks is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Robert Gallager (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.