
6.3: Source Coding Theorem


    Now getting back to the source model, note that the codewords have an average length, in bits per symbol,

    \(L = \displaystyle \sum_{i} p(A_i)L_i \tag{6.11}\)
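
    As a quick numeric illustration of Equation 6.11, here is a minimal Python sketch; the four symbol probabilities and codeword lengths are assumed for illustration, not taken from the text:

    ```python
    # Average codeword length L = sum_i p(A_i) * L_i  (Equation 6.11).
    probabilities = [0.5, 0.25, 0.125, 0.125]  # assumed p(A_i)
    lengths = [1, 2, 3, 3]                     # assumed L_i, in bits

    L = sum(p * l for p, l in zip(probabilities, lengths))
    print(L)  # 1.75 bits per symbol
    ```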

    For maximum speed, the lowest possible average codeword length is needed. Assigning the shortest codewords to the most probable symbols helps make \(L\) small; Huffman codes are optimal for this purpose. However, there is a limit to how short the average codeword can be. Specifically, the Source Coding Theorem states that the average information per symbol is always less than or equal to the average codeword length:

    \(H \leq L \tag{6.12}\)
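
    To see the theorem at work, here is a small Python sketch (the source distribution is assumed, not from the text) that builds a binary Huffman code by repeatedly merging the two least likely nodes, then checks that \(H \leq L\):

    ```python
    import heapq
    from math import log2

    def huffman_lengths(probs):
        """Return binary Huffman codeword lengths for the given probabilities."""
        # Heap entries are (probability, tiebreaker, symbol indices) so that
        # ties in probability never force Python to compare the lists.
        heap = [(p, i, [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        lengths = [0] * len(probs)
        tiebreaker = len(probs)
        while len(heap) > 1:
            p1, _, s1 = heapq.heappop(heap)
            p2, _, s2 = heapq.heappop(heap)
            for s in s1 + s2:   # merging pushes these leaves one level deeper
                lengths[s] += 1
            heapq.heappush(heap, (p1 + p2, tiebreaker, s1 + s2))
            tiebreaker += 1
        return lengths

    probs = [0.4, 0.3, 0.2, 0.1]                    # assumed distribution
    lengths = huffman_lengths(probs)                # [1, 2, 3, 3]
    H = sum(p * log2(1 / p) for p in probs)         # ~1.846 bits per symbol
    L = sum(p * l for p, l in zip(probs, lengths))  # 1.9 bits per symbol
    assert H <= L                                   # Equation 6.12
    ```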

    Inequality 6.12 is easy to prove using the Gibbs and Kraft inequalities. Apply the Gibbs inequality with \(p'(A_i) = 1/2^{L_i}\) (the Kraft inequality assures that the \(p'(A_i)\), besides being positive, add up to no more than 1). Thus

    Figure 6.3: Binary Channel

    \[\begin{align*} H \quad &= \quad \displaystyle \sum_{i} p(A_i)\log_2\Big(\dfrac{1}{p(A_i)}\Big) \\ &\leq \quad\displaystyle \sum_{i} p(A_i)\log_2\Big(\dfrac{1}{p'(A_i)}\Big) \\ &= \quad \displaystyle \sum_{i} p(A_i)\log_2 2^{L_i} \\ &= \quad\sum_{i} p(A_i)L_i \\ &= \quad L \tag{6.13} \end{align*} \]
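
    The steps of Equation 6.13 can be checked numerically. The sketch below uses assumed probabilities and codeword lengths (illustrative only): it verifies the Kraft sum for \(p'(A_i) = 1/2^{L_i}\), the Gibbs bound, and the final equality with \(L\):

    ```python
    from math import log2

    probs   = [0.4, 0.3, 0.2, 0.1]   # assumed p(A_i)
    lengths = [1, 2, 3, 3]           # assumed L_i satisfying Kraft

    # Kraft inequality: the p'(A_i) = 2**(-L_i) sum to at most 1.
    p_prime = [2 ** -l for l in lengths]
    assert sum(p_prime) <= 1

    H = sum(p * log2(1 / p) for p in probs)                       # entropy
    gibbs = sum(p * log2(1 / q) for p, q in zip(probs, p_prime))  # Gibbs bound
    L = sum(p * l for p, l in zip(probs, lengths))                # Eq. 6.11

    assert H <= gibbs             # Gibbs inequality
    assert abs(gibbs - L) < 1e-9  # log2(2**L_i) = L_i, so the bound equals L
    ```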

    The Source Coding Theorem can also be expressed in terms of transmission rates in bits per second, by multiplying Equation 6.12 by the symbol rate \(R\) (in symbols per second):

    \(HR \leq LR \tag{6.14}\)
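
    For instance, reusing the illustrative figures from the sketches above together with an assumed symbol rate:

    ```python
    H = 1.846   # bits per symbol (entropy; illustrative value from above)
    L = 1.9     # bits per symbol (average codeword length; assumed)
    R = 1000    # symbols per second (assumed symbol rate)

    assert H * R <= L * R   # ~1846 bits/s of information <= 1900 bits/s sent
    ```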

