If the channel introduces noise then the output is not a unique function of the input. We will model this case by saying that for every possible input \(A_i\) (the inputs being mutually exclusive states indexed by \(i\)) there may be more than one possible output outcome. Which one actually happens is a matter of chance, and we will model the channel by the set of probabilities that each of the output events \(B_j\) occurs when each of the possible input events \(A_i\) happens. These transition probabilities \(c_{ji}\) are, of course, probabilities, but they are properties of the channel and do not depend on the probability distribution \(p(A_i)\) of the input. Like all probabilities, they have values between 0 and 1
\(0 \leq c_{ji} \leq 1 \tag{6.15}\)
and may be thought of as forming a matrix with as many columns as there are input events, and as many rows as there are output events. Because each input event must lead to exactly one output event,
\(1 = \displaystyle \sum_{j} c_{ji} \tag{6.16}\)
for each \(i\). (In other words, the sum of \(c_{ji}\) in each column \(i\) is 1.) If the channel is noiseless, for each value of \(i\) exactly one of the various \(c_{ji}\) is equal to 1 and all others are 0.
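The column-sum property of Equation 6.16 can be checked directly. The sketch below uses a hypothetical channel with two inputs and three outputs; the specific matrix entries are illustrative, not from the text.

```python
# Hypothetical channel: c[j][i] = Pr(output B_j | input A_i).
# Rows index outputs j, columns index inputs i.
c = [
    [0.8, 0.1],   # Pr(B_0 | A_0), Pr(B_0 | A_1)
    [0.1, 0.2],   # Pr(B_1 | A_0), Pr(B_1 | A_1)
    [0.1, 0.7],   # Pr(B_2 | A_0), Pr(B_2 | A_1)
]

# Equation 6.16: each input must lead to exactly one output,
# so every column of the transition matrix sums to 1.
for i in range(len(c[0])):
    col_sum = sum(row[i] for row in c)
    assert abs(col_sum - 1.0) < 1e-12
```

A noiseless channel would be the special case in which every column contains a single 1 and the rest 0s.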
When the channel is driven by a source with probabilities \(p(A_i)\), the conditional probabilities of the output events, conditioned on the input events, are
\(p(B_j \; | \; A_i) = c_{ji} \tag{6.17}\)
The unconditional probability of each output \(p(B_j)\) is

\(p(B_j) = \displaystyle \sum_{i} c_{ji}\, p(A_i) \tag{6.18}\)
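This weighted sum \(p(B_j) = \sum_i c_{ji}\, p(A_i)\) is just a matrix-vector product, and can be sketched in a few lines. The channel matrix and input distribution below are hypothetical values chosen for illustration.

```python
# Hypothetical channel: c[j][i] = Pr(B_j | A_i); columns sum to 1.
c = [
    [0.8, 0.1],
    [0.1, 0.2],
    [0.1, 0.7],
]
p_A = [0.6, 0.4]  # assumed input distribution p(A_i)

# Unconditional output probabilities: p(B_j) = sum_i c_{ji} * p(A_i)
p_B = [sum(c[j][i] * p_A[i] for i in range(len(p_A)))
       for j in range(len(c))]

# Because each column of c sums to 1, p_B is itself a
# probability distribution: its entries sum to 1.
assert abs(sum(p_B) - 1.0) < 1e-12
```

Here `p_B` works out to `[0.52, 0.14, 0.34]`, which sums to 1 as expected.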
The simplest noisy channel is the symmetric binary channel, for which there is a (hopefully small) probability \(\epsilon\) of an error, so

\(c = \begin{pmatrix} 1-\epsilon & \epsilon \\ \epsilon & 1-\epsilon \end{pmatrix} \tag{6.19}\)
This binary channel is called symmetric because the probability of an error for both inputs is the same. If \(\epsilon = 0\) then this channel is noiseless (it is also noiseless if \(\epsilon = 1\), in which case it behaves like an inverter). Figure 6.3 can be made more useful for the noisy channel if the possible transitions from input to output are shown, as in Figure 6.5.
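The symmetric binary channel can be instantiated in the same style. One consequence worth noting: when the input bits are equally likely, the output bits are equally likely too, regardless of \(\epsilon\). The value of \(\epsilon\) below is an assumed example.

```python
eps = 0.01  # assumed error probability

# Transition matrix for the symmetric binary channel:
# each input bit is flipped with probability eps.
c = [
    [1 - eps, eps],      # Pr(B_0 | A_0), Pr(B_0 | A_1)
    [eps, 1 - eps],      # Pr(B_1 | A_0), Pr(B_1 | A_1)
]

# With a uniform input distribution, the output is also uniform:
p_A = [0.5, 0.5]
p_B = [sum(c[j][i] * p_A[i] for i in (0, 1)) for j in (0, 1)]
# p_B == [0.5, 0.5] for any eps
```

Setting `eps = 0` recovers the noiseless channel (identity matrix), and `eps = 1` the inverter.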
If the output \(B_j\) is observed to be in one of its (mutually exclusive) states, can the input \(A_i\) that caused it be determined? In the absence of noise, yes; there is no uncertainty about the input once the output is known. However, with noise there is some residual uncertainty. We will calculate this uncertainty in terms of the transition probabilities \(c_{ji}\) and define the information that we have learned about the input as a result of knowing the output as the mutual information. From that we will define the channel capacity \(C\).