7.7: Dynamic Programming

Last updated
Save as PDF

Page ID: 47264

Franz S. Hover & Michael S. Triantafyllou
Massachusetts Institute of Technology via MIT OpenCourseWare

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

We introduce a very powerful approach to solving a wide array of complicated optimization problems, especially those where the space of unknowns is very high, e.g., it is a trajectory itself, or a complex sequence of actions, that is to be optimized. Only an introductory description here is given, focusing on shortest-path problems. A great many procedure and planning applications can be cast as shortest-path problems.

We begin with the essential concept. Suppose that we are driving from Point A to Point C, and we ask what is the shortest path in miles. If A and C represent Los Angeles and Boston, for example, there are many paths to choose from! Assume that one way or another we have found the best path, and that a Point B lies along this path, say Las Vegas. Let \(X\) be an arbitrary point east of Las Vegas. If we were to now solve a new optimization problem for getting only from Las Vegas to Boston, this same arbitrary point \(X\) would be along the new optimal path as well.

The point is a subtle one: the optimization problem from Las Vegas to Boston is easier than that from Los Angeles to Boston, and the idea is to use this property backwards through time to evolve the optimal path, beginning in Boston.

Visualization of the problem given immediately below, showing that there are 2 layers of nodes and 3 nodes in each layer. — Figure \(\PageIndex{1}\): visualization of the example problem immediately below. \(s=2\), i.e. there are two layers of nodes (the Rocky Mountains and the Mississippi River), and \(n=3\), i.e. there are three nodes or locations where it is possible to cross each of these obstacles.

Example: Nodal Travel. We now add some structure to the above experiment. Consider now traveling from point A (Los Angeles) to point D (Boston). Suppose there are only three places to cross the Rocky Mountains, \(B_1, \, B_2, \, B_3\), and three places to cross the Mississippi River, \(C_1, \, C_2, \, C_3\). By way of notation, we say that the path from \(A\) to \(B_1\) is \(AB_1\). Suppose that all of the paths (and distances) from \(A\) to the \(B\)-nodes are known, as are those from the \(B\)-nodes to the \(C\)-nodes, and the \(C\)-nodes to the terminal point \(D\). There are nine unique paths from \(A\) to \(D\).

A brute-force approach sums up the total distance for all the possible paths, and picks the shortest one. In terms of computations, we could summarize that this method requires nine additions of three numbers, equivalent to eighteen additions of two numbers. The comparison of numbers is relatively cheap.

The dynamic programming approach has two steps. First, from each \(B\)-node, pick the best path to \(D\). There are three possible paths from \(B_1\) to \(D\), for example, and nine paths total from the \(B\)-level to \(D\). Store the best paths as \(B_1 D|_{opt}, \, B_2 D|_{opt}, \, B_3 D|_{opt}\). This operation involves nine additions of two numbers. Second, compute the distance for each of the possible paths from \(A\) to \(D\), constrained to the optimal paths from the \(B\)-nodes onward: \(A B_1 + B_1 D|_{opt}, \, A B_2 + B_2 D|_{opt},\) or \(A B_3 + B_3 D|_{opt}\). The combined path with the shortest distance is the total solution; this second step involves three sums of two numbers, and the total optimization is done in twelve additions of two numbers.

Needless to say, this example gives only a mild advantage to the dynamic programming approach over brute force. The gap widens vastly, however, as one increases the dimensions of the solution space. In general, if there are \(s\) layers of nodes (e.g., rivers or mountain ranges), and each has width \(n\) (e.g., \(n\) river crossing points), the brute force approach will take \(s n^s\) additions, while the dynamic programming procedure involves only \(n^2(s-1)+n)\) additions. In the case of \(n=5, \, s=5\), brute force requires \(15625\) additions; dynamic programming needs only \(105\)!