Engineering LibreTexts

17.6: Simpson’s Paradox

    • Eric Lehman, F. Thomson Leighton, & Albert R. Meyer
    • Google and Massachusetts Institute of Technology via MIT OpenCourseWare

    In 1973, a famous university was investigated for gender discrimination[5]. The investigation was prompted by evidence that, at first glance, appeared definitive: in 1973, 44% of male applicants to the school’s graduate programs were accepted, but only 35% of female applicants were admitted.

    However, this data turned out to be completely misleading. Analysis of the individual departments showed not only that few of them exhibited significant evidence of bias, but also that among the few departments that did show statistical irregularities, most were slanted in favor of women. This suggests that if there was any sex discrimination, then it was against men!

    Given the discrepancy in these findings, it feels like someone must be doing bad math—intentionally or otherwise. But the numbers are not actually inconsistent. In fact, this statistical hiccup is common enough to merit its own name: Simpson’s Paradox occurs when multiple small groups of data all exhibit a similar trend, but that trend reverses when those groups are aggregated. To explain how this is possible, let’s first clarify the problem by expressing both arguments in terms of conditional probabilities. For simplicity, suppose that there are only two departments, EE and CS. Consider the experiment where we pick a random candidate. Define the following events:

    • \(A\) ::= the candidate is admitted to his or her program of choice,
    • \(F_{EE}\) ::= the candidate is a woman applying to the EE department,
    • \(F_{CS}\) ::= the candidate is a woman applying to the CS department,
    • \(M_{EE}\) ::= the candidate is a man applying to the EE department,
    • \(M_{CS}\) ::= the candidate is a man applying to the CS department.

    \[\nonumber \begin{array}{l|ll}
    \text{CS} & \text{2 men admitted out of 5 candidates} & 40\%\\ & \text{50 women admitted out of 100 candidates} & 50\%\\
    \text{EE} & \text{70 men admitted out of 100 candidates} & 70\%\\ & \text{4 women admitted out of 5 candidates} & 80\%\\
    \hline \text{Overall} & \text{72 men admitted, 105 candidates} & \approx 69\%\\ & \text{54 women admitted, 105 candidates} & \approx 51\%\\ \end{array}\]

    Table 17.1 A scenario in which men are overall more likely than women to be admitted to a school, despite being less likely to be admitted into any given program.
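
    The percentages in Table 17.1 can be checked mechanically. The following is a minimal sketch, not part of the original text: the dictionary `counts` and the helper names `department_rate` and `overall_rate` are illustrative choices, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

# (admitted, applicants) for each group, taken from Table 17.1.
# EE women are entered as 4 of 5, which matches the table's 80% rate
# and its overall total of 105 women candidates.
counts = {
    ("CS", "men"): (2, 5),
    ("CS", "women"): (50, 100),
    ("EE", "men"): (70, 100),
    ("EE", "women"): (4, 5),
}


def department_rate(dept, gender):
    """Admission rate for one gender within one department."""
    admitted, applicants = counts[(dept, gender)]
    return Fraction(admitted, applicants)


def overall_rate(gender):
    """Admission rate for one gender pooled across all departments."""
    admitted = sum(a for (_, g), (a, _) in counts.items() if g == gender)
    applicants = sum(n for (_, g), (_, n) in counts.items() if g == gender)
    return Fraction(admitted, applicants)


if __name__ == "__main__":
    for dept in ("CS", "EE"):
        for gender in ("men", "women"):
            print(f"{dept} {gender}: {float(department_rate(dept, gender)):.0%}")
    for gender in ("men", "women"):
        print(f"Overall {gender}: {float(overall_rate(gender)):.0%}")
```

    Within each department the women's rate comes out higher, while the pooled rate is higher for men, which is exactly the reversal the table is meant to illustrate.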

    Assume that all candidates are either men or women, and that no candidate belongs to both departments. That is, the events \(F_{EE}, F_{CS}, M_{EE},\) and \(M_{CS}\) are all disjoint.

    In these terms, the plaintiff is making the following argument:

    \[\nonumber \text{Pr}[A \mid M_{EE} \cup M_{CS}] > \text{Pr}[A \mid F_{EE} \cup F_{CS}].\]

    In plain English: across the university as a whole, a woman candidate is less likely to be admitted than a man.

    The university retorts that in any given department, a woman candidate has chances equal to or greater than those of a male candidate; more formally, that

    \[\begin{aligned}\text{Pr}[A \mid M_{EE}] &\leq \text{Pr}[A \mid F_{EE}] \quad \text{and} \\ \text{Pr}[A \mid M_{CS}] &\leq \text{Pr}[A \mid F_{CS}]. \end{aligned}\]

    It is easy to believe that these two positions are contradictory. But Table 17.1 shows a set of admission statistics for which the assertions of both the plaintiff and the university hold. In this case, a higher percentage of female applicants were admitted to each department, but overall a higher percentage of males were accepted! So the apparently contradictory claims can in fact both be true. How can we make sense of this seemingly paradoxical situation?

    Initially, we and the plaintiffs both assumed that the overall admissions statistics for the university could only be explained by discrimination. However, the department-by-department breakdown shows that the source of the discrepancy is that the CS department admits a much smaller share of its candidates overall (about 50%, versus about 70% for EE), yet attracts far more women applicants than the more permissive EE department (see footnote 3). This leads us to the conclusion that the admissions gap is not due to any systematic bias on the school’s part.
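
    The arithmetic behind this explanation can be made explicit with the law of total probability. As shorthand that does not appear in the original text, write \(M ::= M_{EE} \cup M_{CS}\) and \(F ::= F_{EE} \cup F_{CS}\). Using the numbers in Table 17.1, each overall admission rate is a weighted average of the departmental rates, where the weights are the fractions of that gender's candidates applying to each department:

    \[\nonumber \begin{aligned}
    \text{Pr}[A \mid M] &= \text{Pr}[A \mid M_{EE}] \cdot \text{Pr}[M_{EE} \mid M] + \text{Pr}[A \mid M_{CS}] \cdot \text{Pr}[M_{CS} \mid M] \\
    &= \frac{70}{100} \cdot \frac{100}{105} + \frac{2}{5} \cdot \frac{5}{105} = \frac{72}{105} \approx 69\%, \\
    \text{Pr}[A \mid F] &= \text{Pr}[A \mid F_{EE}] \cdot \text{Pr}[F_{EE} \mid F] + \text{Pr}[A \mid F_{CS}] \cdot \text{Pr}[F_{CS} \mid F] \\
    &= \frac{4}{5} \cdot \frac{5}{105} + \frac{50}{100} \cdot \frac{100}{105} = \frac{54}{105} \approx 51\%.
    \end{aligned}\]

    The men's average puts almost all of its weight on the permissive EE rate, while the women's average puts almost all of its weight on the stricter CS rate. That difference in weights, rather than any difference in how a department treats its candidates, is what produces the overall gap in this scenario.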

    But suppose we replaced “the candidate is a man/woman applying to the EE department,” by “the candidate is a man/woman for whom an admissions decision was made during an odd-numbered day of the month,” and likewise with CS and an even-numbered day of the month. Since we don’t think the parity of a date is a cause for the outcome of an admission decision, we would most likely dismiss the “coincidence” that on both odd and even dates, women are more frequently admitted. Instead we would judge, based on the overall data showing women less likely to be admitted, that gender bias against women was an issue in the university.

    Bear in mind that it would be the same numerical data that we would be using to justify our different conclusions in the department-by-department case and the even-day-odd-day case. We interpreted the same numbers differently based on our implicit causal beliefs, specifically that departments matter and date parity does not. It is circular to claim that the data corroborated our beliefs that there is or is not discrimination. Rather, our interpretation of the data correlation depended on our beliefs about the causes of admission in the first place (see footnote 4). This example highlights a basic principle in statistics which people constantly ignore: never assume that correlation implies causation.

    3. At the actual university in the lawsuit, the “exclusive” departments more popular among women were those that did not require a mathematical foundation, such as English and education. Women’s disproportionate choice of these careers reflects gender bias, but one which predates the university’s involvement.

    4. These issues are thoughtfully examined in Causality: Models, Reasoning and Inference, Judea Pearl, Cambridge U. Press, 2001.


    This page titled 17.6: Simpson’s Paradox is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Eric Lehman, F. Thomson Leighton, & Albert R. Meyer (MIT OpenCourseWare).
