
4.4: Scaling exponents in mathematics


    Our scaling relations so far have connected physical quantities. But proportional reasoning can also give us insight into mathematics. A classic example is the birthday paradox.

    How many people must be in a room before the probability that two people share a birthday (for example, two people are born on July 18) is at least 50 percent?

    Almost everyone, including me, reasons that, with 365 days in the year, we need roughly 50 percent of 365, or 183 people. Let’s test this conjecture. The actual probability of having a shared birthday is

    \[1-\left(1 - \frac{1}{365}\right)\left(1-\frac{2}{365}\right)\left(1- \frac{3}{365}\right) \cdots \left(1-\frac{n-1}{365}\right).\]

    (A derivation is in Everyday Probability and Statistics [50, p. 49] or in the advanced classic Probability Theory and Its Applications [13, vol. 1, p. 33]. But you can explain its structure already: try Problem 4.23.) For \(n = 183\), the probability is \(1 - 4.8 \times 10^{-25}\), or almost exactly 1.
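    Because the formula is a simple product, it is easy to evaluate numerically. Here is a short Python sketch (the function name is ours, not from the text):

```python
def prob_shared_birthday(n, days=365):
    """Exact probability that at least two of n people share a birthday."""
    p_no_share = 1.0
    for k in range(1, n):
        p_no_share *= 1 - k / days
    return 1 - p_no_share

print(prob_shared_birthday(183))  # indistinguishable from 1 in floating point
```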

    Exercise \(\PageIndex{1}\): Explaining the probability of a shared birthday

    Explain the pieces of the formula for the probability of a shared birthday: What is the reason for the \(1-\) in front of the product? Why does each factor in the product have the form \(1-\) (something)? And why does the last factor have \((n-1)/365\) rather than \(n/365\)?

    To make this surprisingly high probability seem plausible, I gave random birthdays to 183 simulated people. On 14 days, two people shared a birthday (and we need only one such day); on three days, three people shared a birthday. According to this simulation and to the exact calculation, the plausible first guess of 183 people is far too high. This surprising result is the birthday paradox.
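    You can rerun such a simulation yourself. The following sketch (our code, not the author's) generates random rooms of 183 people and checks for shared birthdays:

```python
import random
from collections import Counter

def shared_birthday_days(n, days=365, seed=None):
    """For one random room of n people, count the calendar days
    shared by two or more people."""
    rng = random.Random(seed)
    birthdays = Counter(rng.randrange(days) for _ in range(n))
    return sum(1 for count in birthdays.values() if count >= 2)

# Fraction of simulated rooms of 183 people with at least one shared day
rooms = 10_000
hits = sum(shared_birthday_days(183) > 0 for _ in range(rooms))
print(hits / rooms)  # essentially 1.0
```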

    Even though we could use the exact probability to find the threshold number of people that makes the probability greater than 50 percent, we would still wonder why: Why is the plausible argument that n ≈ 0.5 × 365 so badly wrong? That insight comes from a scaling analysis.

    At roughly what n does the probability of a shared birthday rise above 50 percent?

    An image often helps me translate a problem into mathematics. Imagine everyone in the room greeting all the others by shaking hands, one handshake at a time, and checking for a shared birthday with each handshake.

    How many handshakes happen?

    Each person shakes hands with \(n-1\) others, making \(n(n-1)\) arrows from one person to another. But a handshake is shared between two people. To avoid counting each handshake twice, we need to divide \(n(n-1)\) by 2. Thus, there are \(n(n-1)/2\), or roughly \(n^{2}/2\), handshakes.
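    A quick numerical check of the counting argument (a sketch; the function name is ours) also shows why 183 people is overkill:

```python
def handshakes(n):
    """Number of distinct pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2

print(handshakes(183))  # 16653 pairs, far more than 365 possible birthdays
```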

    With 365 possible birthdays, the probability, per handshake, of a shared birthday is \(1/365\). With \(n^{2}/2\) handshakes, the probability that no handshake joins two people with a shared birthday is approximately

    \[(1-\frac{1}{365})^{n^{2}/2}.\]

    To approximate this probability, take its natural logarithm:

    \[\ln \left[ \left(1-\frac{1}{365}\right)^{n^{2}/2} \right] = \frac{n^{2}}{2} \ln \left(1-\frac{1}{365}\right).\]

    Then approximate the logarithm using \(\ln(1+x) \approx x\) (for \(x \ll 1\)):

    \[\ln \left(1-\frac{1}{365}\right) \approx -\frac{1}{365}.\]

    (You can test this useful approximation using a calculator, or see the pictorial explanation in Street-Fighting Mathematics [33, Section 4.3].) With that approximation,
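    A calculator check of the approximation, as a two-line Python sketch:

```python
import math

x = 1 / 365
print(math.log(1 - x))  # about -0.002743
print(-x)               # about -0.002740
```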

    \[\ln(\textrm{probability of no shared birthday}) \approx - \frac{n^{2}}{2} \times \frac{1}{365}.\]

    When the probability of no shared birthday falls below 0.5, the probability of a shared birthday rises above 0.5. Therefore, the condition for having enough people is

    \[\ln(\textrm{probability of no shared birthday}) < \ln \frac{1}{2}.\]

    Because \(\ln(1/2) = -\ln 2\), the condition simplifies to \(\frac{n^{2}}{2 \times 365} > \ln 2.\)

    Using the approximation \(\ln 2 \approx 0.7\), the threshold \(n\) is approximately 22.6:

    \[n > \sqrt{2 \times 365 \times \ln 2} \approx 22.6.\]
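    The scaling estimate and the exact threshold are both easy to compute; the sketch below (variable names are ours) compares them:

```python
import math

# Scaling estimate from the text: n > sqrt(2 * 365 * ln 2)
n_scaling = math.sqrt(2 * 365 * math.log(2))
print(n_scaling)  # about 22.5

# Exact threshold: smallest n whose shared-birthday probability exceeds 0.5
p_no_share = 1.0  # probability of no shared birthday among n people
n = 1
while 1 - p_no_share <= 0.5:
    p_no_share *= 1 - n / 365  # adding person n + 1
    n += 1
print(n)  # 23
```

The exact threshold, 23 people, confirms that the scaling analysis captures the right dependence on \(n\).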

    From this scaling analysis, we can compactly explain what went wrong with the plausible conjecture. It assumed that the shared-birthday probability \(p\) is proportional to the number of people \(n\), reaching \(p = 0.5\) when \(n = 0.5 \times 365\). The handshake picture, however, shows that the probability depends on the number of handshakes, which is proportional to \(n^{2}\). What a difference a simple change in a scaling exponent makes!

    Exercise \(\PageIndex{2}\): Three people sharing a birthday

    Extend our scaling analysis to the three-birthday problem: How many people must be in a room before the probability of three people sharing a birthday rises above 0.5? (The results of the exact calculation, along with many approximations, are given by Diaconis and Mosteller [10].)

    Exercise \(\PageIndex{3}\): Scaling of bubble sort

    The simplest algorithm for sorting—for example, to sort a list of n web pages according to relevance—is bubble sort. You go through the list in passes, comparing neighboring items and swapping any that are out of order.

    a. How many passes through the list do you need in order to guarantee that the list is sorted?

    b. The running time t (the time to sort the list) is proportional to the number of comparisons. What is the scaling exponent \(\beta\) in \(t \: \propto \: n^{\beta}\)?
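    The pass-and-swap procedure in the problem statement can be illustrated in Python (a sketch of the algorithm only, not a solution to the exercise):

```python
def bubble_sort(items):
    """Sort a copy of the list by repeated passes of adjacent-pair swaps."""
    items = list(items)
    for _ in range(len(items) - 1):       # enough passes to guarantee sortedness
        for i in range(len(items) - 1):   # compare each neighboring pair
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items

print(bubble_sort([98, 33, 34, 62, 31, 58, 61, 15]))
```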

    Exercise \(\PageIndex{4}\): Scaling of merge sort

    Bubble sort (Problem \(\PageIndex{3}\)) is easy to describe, but there is an alternative that is almost as easy: merge sort. It is a recursive, divide-and-conquer algorithm. You divide the list of n items into two equal parts (assume that n is a power of 2), and sort each half using merge sort. To make the sorted list, just merge the two sorted halves.

    a. Here is a list of eight randomly generated numbers: 98, 33, 34, 62, 31, 58, 61, and 15. Draw a tree illustrating how merge sort works on this list. Use two lines for each internal node, showing on one line the original list and on the second line the sorted list.

    b. The running time t is proportional to the number of comparison operations. What are the scaling exponents \(\alpha\) and \(\beta\) in

    \[t \propto n^{\alpha} (\log n)^{\beta}?\]

    c. If you were a revenue agency and had to sort the tax records of all residents in a country, which algorithm would you use, merge sort or bubble sort?
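    The divide-and-conquer recursion described in the problem statement can be sketched in Python (an illustration of the algorithm, not a solution to the exercise):

```python
def merge_sort(items):
    """Recursively sort each half of the list, then merge the sorted halves."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    # Merge: repeatedly take the smaller front element of the two halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([98, 33, 34, 62, 31, 58, 61, 15]))
```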

    Exercise \(\PageIndex{5}\): Scaling of standard multiplication

    In the usual school algorithm for multiplying \(n\)-digit numbers, you find and add partial products. If the running time \(t\) is proportional to the number of single-digit multiplications, what is the scaling exponent \(\beta\) in \(t \propto n^{\beta}\)?

    Exercise \(\PageIndex{6}\): Scaling of Karatsuba multiplication

    The Karatsuba algorithm for multiplying two n-digit numbers, discovered by Anatoly Karatsuba in 1960 [28] and published in 1962 [29], was the first development in the theory of multiplication in centuries. Similarly to merge sort, you first break each number into two equal-length halves. For example, you split 2743 into 27 and 43. Using Karatsuba multiplication recursively, you form three products from those halves, and combine the three products to get the original product. The subdividing stops when the numbers are short enough to be multiplied by the computer’s hardware (typically, at 32 or 64 bits).

    The expensive step, repeated many times, is hardware multiplication, so the running time \(t\) is proportional to the number of hardware multiplications. Find the scaling exponent \(\beta\) in \(t \propto n^{\beta}\). (You will find a scaling exponent that occurs rarely in physical scalings, namely an irrational number.) How does this \(\beta\) compare with the exponent in the school algorithm for multiplication (Problem \(\PageIndex{5}\))?
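    The three-product recursion can be sketched in Python (our illustration; the base case here stands in for the hardware multiplication at which the subdividing stops):

```python
def karatsuba(x, y):
    """Multiply nonnegative integers using three recursive half-size products."""
    if x < 10 or y < 10:  # short enough: stand-in for hardware multiplication
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    high_x, low_x = divmod(x, 10 ** m)   # split x into two halves of digits
    high_y, low_y = divmod(y, 10 ** m)
    z0 = karatsuba(low_x, low_y)         # product of the low halves
    z2 = karatsuba(high_x, high_y)       # product of the high halves
    # Third product gives the cross terms without two extra multiplications
    z1 = karatsuba(low_x + high_x, low_y + high_y) - z0 - z2
    return z2 * 10 ** (2 * m) + z1 * 10 ** m + z0

print(karatsuba(2743, 1234))  # 3384862
```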


    This page titled 4.4: Scaling exponents in mathematics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Sanjoy Mahajan (MIT OpenCourseWare).
