
11.2: Principle of Maximum Entropy for Physical Systems


    According to the multi-state model motivated by quantum mechanics (see Chapter 10 of these notes), there are a finite (or countably infinite) number of quantum states of the system. We will use \(i\) as an index over these states. The states have energy \(E_i\), and might have other physical attributes as well. After these states are enumerated and described, the Principle of Maximum Entropy can be used, as a separate step, to estimate how likely each state is to be occupied.

    We denote the occupancy of state \(i\) by the event \(A_i\). State \(i\) has probability \(p(A_i)\) of being occupied. For simplicity we will write this probability \(p(A_i)\) as \(p_i\). We use the Principle of Maximum Entropy to estimate the probability distribution \(p_i\) consistent with the average energy \(E\) being a known (for example, measured) quantity \(\widetilde{E}\). Thus

    \(\begin{align*} \widetilde{E} &= \displaystyle \sum_{i} p_i E_i \tag{11.1} \\ 1 &= \displaystyle \sum_{i} p_i \tag{11.2} \end{align*}\)

    The entropy is

    \(S = k_B \displaystyle \sum_{i} p_i \ln \Big( \dfrac{1}{p_i} \Big) \tag{11.3}\)

    where \(k_B = 1.38 \times 10^{-23}\) Joules per Kelvin is known as Boltzmann’s constant.

    The probability distribution that maximizes \(S\) subject to a constraint like Equation 11.1 was presented in Chapter 9, Equation 9.12. That formula was for the case where entropy was expressed in bits; the corresponding formula for physical systems, with entropy expressed in Joules per Kelvin, is the same except for the use of \(e\) rather than 2:

    \(p_i = e^{-\alpha}e^{-\beta E_i} \tag{11.4}\)

    so that

    \(\ln \Big( \dfrac {1}{p_i}\Big) = \alpha + \beta E_i \tag{11.5}\)

    The sum of the probabilities must be 1 and therefore

    \(\alpha = \ln \Big ( \displaystyle \sum_{i} e^{-\beta E_i} \Big) \tag{11.6}\)
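    Equation 11.6 simply chooses \(\alpha\) so that the probabilities of Equation 11.4 sum to 1. The short sketch below illustrates this numerically for a hypothetical two-state system; the energy values and the choice of \(\beta\) are illustrative assumptions, not values from the text.

    ```python
    import math

    # Hypothetical two-state system: energies in Joules (illustrative values only)
    E = [0.0, 1.0e-21]
    beta = 1.0e21  # illustrative value of the parameter beta (units 1/Joule)

    # Equation 11.6: alpha = ln( sum_i e^{-beta * E_i} )
    alpha = math.log(sum(math.exp(-beta * Ei) for Ei in E))

    # Equation 11.4: p_i = e^{-alpha} e^{-beta * E_i}
    p = [math.exp(-alpha - beta * Ei) for Ei in E]

    print(sum(p))  # sums to 1, by the construction of alpha
    ```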

    As expressed in terms of the Principle of Maximum Entropy, the objective is to find the probability distribution and related quantities given the expected energy \(\widetilde{E}\). However, except in the simplest circumstances, it is usually easier to do the calculation the other way around. That is, it is easier to use \(\beta\) as an independent variable, calculate \(\alpha\) in terms of it, and then find the \(p_i\), the entropy \(S\), and the energy \(E\).
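    This workflow can be sketched in code: pick \(\beta\), compute \(\alpha\) from Equation 11.6, the \(p_i\) from Equation 11.4, then \(E\) and \(S\) from Equations 11.1 and 11.3. To recover the \(\beta\) matching a measured \(\widetilde{E}\), one can exploit the fact that \(E\) decreases monotonically with \(\beta\) and search numerically. The function names, the bisection search, and the example energies are illustrative assumptions, not part of the text.

    ```python
    import math

    K_B = 1.380649e-23  # Boltzmann's constant, Joules per Kelvin

    def distribution(energies, beta):
        """Given beta, return (alpha, probabilities) per Equations 11.6 and 11.4."""
        alpha = math.log(sum(math.exp(-beta * Ei) for Ei in energies))
        return alpha, [math.exp(-alpha - beta * Ei) for Ei in energies]

    def energy_and_entropy(energies, beta):
        """Expected energy (Equation 11.1) and entropy (Equation 11.3) for this beta."""
        _, p = distribution(energies, beta)
        E = sum(pi * Ei for pi, Ei in zip(p, energies))
        S = K_B * sum(pi * math.log(1.0 / pi) for pi in p if pi > 0.0)
        return E, S

    def beta_for_energy(energies, E_target, lo=-1.0e23, hi=1.0e23, iters=200):
        """Bisect for the beta whose expected energy equals E_target.

        Works because E is a monotonically decreasing function of beta
        (assumes E_target lies strictly between the min and max energies).
        """
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            E_mid, _ = energy_and_entropy(energies, mid)
            if E_mid > E_target:
                lo = mid  # energy too high: raise beta to favor low-energy states
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Illustrative two-state system with a hypothetical measured energy
    energies = [0.0, 1.0e-21]
    beta = beta_for_energy(energies, 0.25e-21)
    E, S = energy_and_entropy(energies, beta)
    ```

    For the two-state example, the probability of the upper state must equal \(0.25\) to give the target energy, which pins down \(\beta = \ln(3)/(1.0 \times 10^{-21}\,\text{J})\); the bisection recovers this value without solving the equation analytically.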


    This page titled 11.2: Principle of Maximum Entropy for Physical Systems is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Paul Penfield, Jr. (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.