15.2: Counting with Generating Functions

Last updated
Save as PDF

Page ID: 48407

Eric Lehman, F. Thomson Leighton, & Alberty R. Meyer
Google and Massachusetts Institute of Technology via MIT OpenCourseWare

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Generating functions are particularly useful for representing and counting the number of ways to select \(n\) things. For example, suppose there are two flavors of donuts, chocolate and plain. Let dn be the number of ways to select \(n\) chocolate or plain flavored donuts. \(d_n = n + 1\), because there are \(n + 1\) such donut selections—all chocolate, 1 plain and \(n - 1\) chocolate, 2 plain and \(n - 2\) chocolate,... , all plain. We define a generating function, \(D(x)\), for counting these donut selections by letting the coefficient of \(x^n\) be \(d_n\). This gives us equation (15.1.5)

\[\label{15.2.1} D(x) = \dfrac{1}{(1-x)^2}.\]

Apples and Bananas too

More generally, suppose we have two kinds of things—say, apples and bananas— and some constraints on how many of each may be selected. Say there are \(a_n\) ways to select \(n\) apples and \(b_n\) ways to select \(n\) bananas. The generating function for counting apples would be

\[\nonumber A(x) ::= \sum_{n=0}^{\infty} a_n x^n,\]

and for bananas would be

\[\nonumber B(x) ::= \sum_{n=0}^{\infty} b_n x^n,\]

Now suppose apples come in baskets of 6, so there is no way to select 1 to 5 apples, one way to select 6 apples, no way to select 7, etc. In other words,

\[\nonumber a_n::=\left\{\begin{array}{ll} 1 & \text { if } n \text { is a multiple of 6,} \\ 0 & \text { otherwise.} \end{array}\right.\]

In this case we would have

\[\begin{aligned} A(x) &= 1+ x^6 + x^{12} + \cdots + x^{6n} + \cdots \\ &= 1 + y + y^2 + \cdots + y^n + \cdots \quad \text{where } y = x^6.\\ &= \dfrac{1}{1-y} = \dfrac{1}{1-x^6}. \end{aligned}\]

Let’s also suppose there are two kinds of bananas—red and yellow. Now, \(b_n = n + 1\) by the same reasoning used to count selections of \(n\) chocolate and plain donuts, and by (\ref{15.2.1}) we have

\[\nonumber B(x) = \dfrac{1}{(1-x)^2}.\]

So how many ways are there to select a mix of \(n\) apples and bananas? First, we decide how many apples to select. This can be any number \(k\) from 0 to \(n\). We can then select these apples in \(a_k\) ways, by definition. This leaves \(n - k\) bananas to be selected, which by definition can be done in \(b_{n-k}\) ways. So the total number of ways to select \(k\) apples and \(n - k\) bananas is \(a_k b_{n-k}\). This means that the total number of ways to select some size \(n\) mix of apples and bananas is

\[\label{15.2.2} a_0 b_n + a_1 b_{n-1} + a_2 b_{n-2} + \cdots + a_n b_0.\]

Products of Generating Functions

Now here’s the cool connection between counting and generating functions: expression (\ref{15.2.2}) is equal to the coefficient of xn in the product \(A(x)B(x).\)

In other words, we’re claiming that

\[[x^n](A(x)\cdot B(x)) = a_0 b_n + a_1 b_{n-1} + a_2 b_{n-2} + \cdots + a_n b_0.\]

Rule (Product).

To explain the generating function Product Rule, we can think about evaluating the product \(A(x) \cdot B(x)\) by using a table to identify all the cross-terms from the product of the sums:

\[\nonumber \begin{array}{c|ccccc}
& b_{0} x^{0} & b_{1} x^{1} & b_{2} x^{2} & b_{3} x^{3} & \ldots \\
\hline a_{0} x^{0} & a_{0} b_{0} x^{0} & a_{0} b_{1} x^{1} & a_{0} b_{2} x^{2} & a_{0} b_{3} x^{3} & \ldots \\
a_{1} x^{1} & a_{1} b_{0} x^{1} & a_{1} b_{1} x^{2} & a_{1} b_{2} x^{3} & \ldots & \\
a_{2} x^{2} & a_{2} b_{0} x^{2} & a_{2} b_{1} x^{3} & \ldots & & \\
a_{3} x^{3} & a_{3} b_{0} x^{3} & \ldots & & & \\
\vdots & \ldots & & & &
\end{array}\]

In this layout, all the terms involving the same power of \(x\) lie on a 45-degree sloped diagonal. So, the index-\(n\) diagonal contains all the \(x^n\)-terms, and the coefficient of \(x^n\) in the product \(A(x) \cdot B(x)\) is the sum of all the coefficients of the terms on this diagonal, namely, (\ref{15.2.2}). The sequence of coefficients of the product \(A(x) \cdot B(x)\) is called the convolution of the sequences \((a_0, a_1, a_2, \ldots)\) and \((b_0, b_1, b_2, \ldots)\). In addition to their algebraic role, convolutions of sequences play a prominent role in signal processing and control theory.

This Product Rule provides the algebraic justification for the fact that a geometric series equals \(1/(1-x)\) regardless of convergence. Specifically, the constant 1 describes the generating function

\[\nonumber 1 = 1 + 0x + 0x^2 + \cdots + 0x^n + \cdots.\]

Likewise, the expression \(1 - x\) describes the generating function

\[\nonumber 1 - x = 1 + (-1)x + 0x^2 + \cdots + 0x^n + \cdots.\]

So for the series \(G(x)\) whose coefficients are all equal to 1, the Product Rule implies in a purely formal way that

\[\nonumber (1 - x) \cdot G(x) = 1 + 0x + 0x^2 + \cdots + 0x^n + \cdots = 1.\]

In other words, under the Product Rule, the geometric series \(G(x)\) is the multiplicative inverse, \(1/(1-x)\), of \(1 - x\).

Similar reasoning justifies multiplying a generating function by a constant term by term. That is, a special case of the Product Rule is the

Rule (Constant Factor). For any constant, \(c\), and generating function, \(F(x)\),

\[[x^n](c \cdot F(x)) = c \cdot [x^n]F(x).\]

The Convolution Rule

We can summarize the discussion above with the

Rule (Convolution). Let \(A(x)\) be the generating function for selecting items from a set \(A\), and let \(B(x)\) be the generating function for selecting items from a set \(B\) disjoint from \(A\). The generating function for selecting items from the union \(A \cup B\) is the product \(A(x) \cdot B(x)\).

The Rule depends on a precise definition of what “selecting items from the union \(A \cup B\)” means. Informally, the idea is that the restrictions on the selection of items from sets \(A\) and \(B\) carry over to selecting items from \(A \cup B\)¹.

Counting Donuts with the Convolution Rule

We can use the Convolution Rule to derive in another way the generating function \(D(x)\) for the number of ways to select chocolate and plain donuts given in (15.6). To begin, there is only one way to select exactly \(n\) chocolate donuts. That means every coefficient of the generating function for selecting \(n\) chocolate donuts equals one. So the generating function for chocolate donut selections is \(1/(1-x)\); likewise for the generating function for selecting only plain donuts. Now by the Convolution Rule, the generating function for the number of ways to select \(n\) donuts when both chocolate and plain flavors are available is

\[\nonumber D(x) = \dfrac{1}{1-x} \cdot \dfrac{1}{1-x} = \dfrac{1}{(1-x)^2}.\]

So we have derived (\ref{15.2.1}) without appeal to (15.1.5).

Our application of the Convolution Rule for two flavors carries right over to the general case of \(k\) flavors; the generating function for selections of donuts when \(k\) flavors are available is \(1 / (1-x)^k\). We already derived the formula for the number of ways to select a \(n\) donuts when \(k\) flavors are available, namely, \({n + (k - 1) \choose n}\) from Corollary 14.5.3. So we have

\[\label{15.2.5} [x^n] \left( \dfrac{1}{(1-x)^k}\right) = {n + (k-1) \choose n}.\]

Extracting Coefficients from Maclaurin’s Theorem

We’ve used a donut-counting argument to derive the coefficients of \(1/(1 - x)^k\), but it’s instructive to derive this coefficient algebraically, which we can do using Maclaurin’s Theorem:

Theorem \(\PageIndex{1}\)

(Maclaurin’s Theorem).

\[\nonumber f(x) = f(0) + f'(0)x + \dfrac{f''(0)}{2!}x^2 + \dfrac{f'''(0)}{3!}x^3 + \cdots + \dfrac{f^{(n)}(0)}{n!}x^n + \cdots.\]

This theorem says that the \(n\)th coefficient of \(1/(1-x)^k\) is equal to its \(n\)th derivative evaluated at 0 and divided by \(n!\). Computing the \(n\)th derivative turns out not to be very difficult

\[\nonumber \dfrac{d^n}{d^n x} \dfrac{1}{(1-x)^k} = k(k+1) \cdot (k + n - 1)(1-x)^{-(k+n)}\]

(see Problem 15.5), so

\[\begin{aligned} [x^n] \left( \dfrac{1}{(1-x)^k}\right) &= \left( \dfrac{d^n}{d^n x} \dfrac{1}{(1-x)^k} \right) (0) \dfrac{1}{n!}\\ &= \dfrac{k(k+1)\cdots (k+n - 1)(1-0)^{-(k+n)}}{n!} \\ &= {n+(k-1) \choose n}. \end{aligned}\]

In other words, instead of using the donut-counting formula (\ref{15.2.5}) to find the coefficients of \(x^n\), we could have used this algebraic argument and the Convolution Rule to derive the donut-counting formula.

The Binomial Theorem from the Convolution Rule

The Convolution Rule also provides a new perspective on the Binomial Theorem 14.6.4. Here’s how: first, work with the single-element set \(\{a_1\}\). The generating function for the number of ways to select \(n\) different elements from this set is simply \(1 + x\): we have 1 way to select zero elements, 1 way to select the one element, and 0 ways to select more than one element. Similarly, the number of ways to select \(n\) elements from any single-element set \(\{a_i\}\) has the same generating function \(1+x\). Now by the Convolution Rule, the generating function for choosing a subset of \(n\) elements from the set \(\{a_1, a_2, \ldots, a_m\}\) is the product, \((1+x)^m\), of the generating functions for selecting from each of the \(m\) one-element sets. Since we know that the number of ways to select \(n\) elements from a set of size \(m\) is \({m \choose n}\), we conclude that that

\[\nonumber [x^n](1+x)^m = {m \choose n},\]

which is a restatement of the Binomial Theorem 14.6.4. Thus, we have proved the Binomial Theorem without having to analyze the expansion of the expression \((1+x)^m\) into a sum of products.

These examples of counting donuts and deriving the binomial coefficients illustrate where generating functions get their power:

Generating functions can allow counting problems to be solved by algebraic manipulation, and conversely, they can allow algebraic identities to be derived by counting techniques.

Absurd Counting Problem

So far everything we’ve done with generating functions we could have done another way. But here is an absurd counting problem—really over the top! In how many ways can we fill a bag with \(n\) fruits subject to the following constraints?

The number of apples must be even.
The number of bananas must be a multiple of 5.
There can be at most four oranges.
There can be at most one pear.

For example, there are 7 ways to form a bag with 6 fruits:

\[\nonumber \begin{array}{c|ccccccc}
\text { Apples } & 6 & 4 & 4 & 2 & 2 & 0 & 0 \\
\text { Bananas } & 0 & 0 & 0 & 0 & 0 & 5 & 5 \\
\text { Oranges } & 0 & 2 & 1 & 4 & 3 & 1 & 0 \\
\text { Pears } & 0 & 0 & 1 & 0 & 1 & 0 & 1
\end{array}\]

These constraints are so complicated that getting a nice answer may seem impossible. But let’s see what generating functions reveal.

First, we’ll construct a generating function for choosing apples. We can choose a set of 0 apples in one way, a set of 1 apple in zero ways (since the number of apples must be even), a set of 2 apples in one way, a set of 3 apples in zero ways, and so forth. So, we have:

\[\nonumber A(x) = 1 + x^2 + x^4 + x^6 + \cdots = \dfrac{1}{1-x^2}\]

Similarly, the generating function for choosing bananas is:

\[\nonumber B(x) = 1 + x^5 + x^{10} + x^{15} + \cdots = \dfrac{1}{1-x^5}\]

Now, we can choose a set of 0 oranges in one way, a set of 1 orange in one way, and so on. However, we cannot choose more than four oranges, so we have the generating function:

\[\nonumber O(x) = 1 + x + x^{2} + x^{3} + x^{4} = \dfrac{1 - x^5}{1-x}.\]

Here the right hand expression is simply the formula (13.2) for a finite geometric sum. Finally, we can choose only zero or one pear, so we have:

\[\nonumber P(x) = 1 + x\]

The Convolution Rule says that the generating function for choosing from among all four kinds of fruit is:

\[\nonumber \begin{aligned} A(x)B(x)O(x)P(x) &= \dfrac{1}{1-x^2} \dfrac{1}{1-x^5} \dfrac{1 - x^5}{1-x} (1 + x) \\ &= \dfrac{1}{(1-x)^2} \\ &= 1 + 2x + 3x^2 + 4x^3 + \cdots \end{aligned}\]

Almost everything cancels! We’re left with \(1/(1-x)^2\), which we found a power series for earlier: the coefficient of \(x^n\) is simply \(n+1\). Thus, the number of ways to form a bag of \(n\) fruits is just \(n + 1\). This is consistent with the example we worked out, since there were 7 different fruit bags containing 6 fruits. Amazing!

¹Formally, the Convolution Rule applies when there is a bijection between \(n\)-element selections from \(A \cup B\) and ordered pairs of selections from the sets \(A\) and \(B\) containing a total of \(n\) elements. We think the informal statement is clear enough.