
6.21: Source Coding Theorem


    Learning Objectives
    • The Source Coding Theorem states that the entropy of an alphabet of symbols specifies to within one bit how many bits on the average need to be used to send the alphabet.

    The significance of an alphabet's entropy rests in how we can represent it with a sequence of bits. Bit sequences form the "coin of the realm" in digital communications: they are the universal way of representing symbolic-valued signals. We convert back and forth between symbols and bit sequences with what is known as a codebook: a table that associates each symbol with a bit sequence. In creating this table, we must be able to assign a unique bit sequence to each symbol so that we can go between symbols and bit sequences without error.

    Note

    You may be conjuring the notion of hiding information from others when we use the name codebook for the symbol-to-bit-sequence table. There is no relation to cryptology, which comprises mathematically provable methods of securing information. The codebook terminology was developed during the beginnings of information theory just after World War II.

    As we shall explore in some detail elsewhere, digital communication is the transmission of symbolic-valued signals from one place to another. When faced with the problem, for example, of sending a file across the Internet, we must first represent each character by a bit sequence. Because we want to send the file quickly, we want to use as few bits as possible. However, we don't want to use so few bits that the receiver cannot determine what each character was from the bit sequence. For example, we could use one bit for every character: file transmission would be fast but useless because the codebook creates errors. Shannon proved in his monumental work what we call today the Source Coding Theorem. Let \(B(a_{k})\) denote the number of bits used to represent the symbol \(a_{k}\). The average number of bits \(\overline{B(A)}\) required to represent the entire alphabet equals

    \[\sum_{k=1}^{K}B(a_{k})Pr[a_{k}] \nonumber \]
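    To make the averaging concrete, here is a minimal Python sketch (not part of the original text; the symbol names, codeword lengths, and probabilities are illustrative placeholders) that evaluates \(\overline{B(A)}=\sum_{k}B(a_{k})Pr[a_{k}]\) for a small table.

    ```python
    # Average number of bits: sum over symbols of B(a_k) * Pr[a_k]
    # Illustrative values; any codebook supplies its own lengths B(a_k).
    bits = {"a0": 2, "a1": 2, "a2": 2, "a3": 2}                 # B(a_k): codeword lengths
    prob = {"a0": 0.5, "a1": 0.25, "a2": 0.125, "a3": 0.125}    # Pr[a_k]

    avg_bits = sum(bits[s] * prob[s] for s in bits)
    print(avg_bits)   # 2.0 for this equal-length codebook
    ```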

    The Source Coding Theorem states that the average number of bits needed to accurately represent the alphabet need only satisfy

    \[H(A)\leq \overline{B(A)}\leq H(A)+1 \nonumber \]

    Thus, the alphabet's entropy specifies to within one bit how many bits on the average need to be used to send the alphabet. The smaller an alphabet's entropy, the fewer bits required for digital transmission of files expressed in that alphabet.
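    As a quick check on this bound, the following Python sketch (an illustration, not from the text) computes the entropy \(H(A)=-\sum_{k}Pr[a_{k}]\log_{2}Pr[a_{k}]\) of a distribution and tests whether a given codebook's average bit count lies in the interval \([H(A),H(A)+1]\).

    ```python
    import math

    def entropy(probs):
        """H(A) = -sum_k Pr[a_k] * log2 Pr[a_k], measured in bits."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    probs = [0.5, 0.25, 0.125, 0.125]   # an example distribution
    H = entropy(probs)                  # 1.75 bits
    avg_bits = 2.0                      # e.g., a two-bit-per-symbol codebook
    print(H <= avg_bits <= H + 1)       # True: this codebook meets the bound
    ```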

    Example \(\PageIndex{1}\):

    A four-symbol alphabet has the following probabilities.

    \[Pr[a_{0}]=\frac{1}{2} \nonumber \]

    \[Pr[a_{1}]=\frac{1}{4} \nonumber \]

    \[Pr[a_{2}]=\frac{1}{8} \nonumber \]

    \[Pr[a_{3}]=\frac{1}{8} \nonumber \]

    and an entropy of 1.75 bits. Let's see if we can find a codebook for this four-letter alphabet that satisfies the Source Coding Theorem. The simplest code to try is known as the simple binary code: convert the symbol's index into a binary number and use the same number of bits for each symbol by including leading zeros where necessary.

    \[a_{0}\leftrightarrow 00\quad a_{1}\leftrightarrow 01\quad a_{2}\leftrightarrow 10\quad a_{3}\leftrightarrow 11 \nonumber \]
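    A brief Python sketch (illustrative only; the symbol labels are placeholders) of the simple binary code: write each symbol's index in binary and pad with leading zeros to a common width of \(\log_{2}K\) bits when \(K\) is a power of two.

    ```python
    import math

    symbols = ["a0", "a1", "a2", "a3"]            # K = 4 symbols
    width = math.ceil(math.log2(len(symbols)))    # 2 bits per symbol here

    # Simple binary code: symbol index -> fixed-width binary string
    codebook = {s: format(k, f"0{width}b") for k, s in enumerate(symbols)}
    print(codebook)   # {'a0': '00', 'a1': '01', 'a2': '10', 'a3': '11'}
    ```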

    Whenever the number of symbols in the alphabet is a power of two (as in this case), the average number of bits is

    \[\overline{B(A)}=\log_{2}K \nonumber \]

    which equals 2 in this case. Because the entropy equals 1.75 bits, the simple binary code indeed satisfies the Source Coding Theorem (we are within one bit of the entropy limit), but you might wonder whether you can do better. If we choose a codebook with differing numbers of bits for the symbols, a smaller average number of bits can indeed be obtained. The idea is to use shorter bit sequences for the symbols that occur more often. One codebook like this is

    \[a_{0}\leftrightarrow 0\quad a_{1}\leftrightarrow 10\quad a_{2}\leftrightarrow 110\quad a_{3}\leftrightarrow 111 \nonumber \]

    \[\overline{B(A)}=1\cdot \frac{1}{2}+2\cdot \frac{1}{4}+3\cdot \frac{1}{8}+3\cdot \frac{1}{8}=1.75 \nonumber \]

    We can reach the entropy limit! The simple binary code is, in this case, less efficient than the unequal-length code. Using the efficient code, we can transmit the symbolic-valued signal having this alphabet 12.5% faster. Furthermore, we know that no more efficient codebook can be found because of Shannon's Theorem.
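    To verify the arithmetic in this example, a short Python sketch (illustrative, not part of the original text) computes both the entropy and the average length of the unequal-length codebook and confirms that they coincide at 1.75 bits.

    ```python
    import math

    prob = {"a0": 0.5, "a1": 0.25, "a2": 0.125, "a3": 0.125}
    codebook = {"a0": "0", "a1": "10", "a2": "110", "a3": "111"}

    H = -sum(p * math.log2(p) for p in prob.values())      # entropy: 1.75 bits
    avg = sum(len(codebook[s]) * prob[s] for s in prob)    # average length: 1.75 bits
    print(H, avg)   # 1.75 1.75 -- the code reaches the entropy limit
    ```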


    This page titled 6.21: Source Coding Theorem is shared under a CC BY 1.0 license and was authored, remixed, and/or curated by Don H. Johnson via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
