
10.3: Confounding Factors

    Let me speak to two of the items in the Figure 10.3.1 table in particular. The third one on the list, external causation, is a case where a third variable (call it C) comes into play. We refer to this as a confounding factor (or confounding variable) because it “confounds” us: it causes us to interpret the meaning behind the data in an incorrect way. The example in the table is a famous and funny one: clearly sharks don’t react to Ben & Jerry’s daily net profits, and people (probably) don’t run out and buy ice cream to cope with their anxiety about shark attacks. Neither A → B nor B → A holds; instead a third variable – hot days – influences both of them.
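
    To see how a confounder can manufacture an association on its own, here is a minimal simulation sketch. Every number in it (the temperature range, the coefficients, the noise levels) is invented purely for illustration, and the "shark attack" value is an arbitrary index rather than a real count. The thing to notice is that neither A nor B appears in the other's formula; both depend only on C.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# C, the confounder: made-up daily high temperatures.
temperature = [rng.uniform(10, 35) for _ in range(5000)]

# A and B each depend on C plus their own independent noise.
# There is no A -> B term and no B -> A term anywhere below.
ice_cream_sales = [20 * t + rng.gauss(0, 100) for t in temperature]
shark_attacks = [0.1 * t + rng.gauss(0, 0.5) for t in temperature]

r = pearson(ice_cream_sales, shark_attacks)
print(f"correlation between ice cream sales and shark attacks: {r:.2f}")
```

    Under these assumptions the printed correlation comes out clearly positive, even though the code contains no causal link between the two quantities.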

    Now of course it’s not always this obvious. Here’s an example I ran across recently. A magazine article reported on a new health scare: scientists have discovered that eating barbecue can increase your risk of cancer. Pictorially, this claim is illustrated in the causal diagram in Figure 10.3.2, which shows our i.v. and our d.v.; the arrow means exactly what it meant earlier.

    clipboard_e63e610af97e20cabd99f6cf8cf184879.png

    Figure 10.3.1: Various types of causality that could be the underlying reason why an association between A and B exists.

    Unlike sharks and ice cream, this one seems plausible. And I’m not claiming to have read enough about their study to tell whether the researchers’ claim is bogus. But I couldn’t help thinking that there are a great many possibly confounding factors that could be blurring the results. For one, choosing to eat barbecue a lot is probably often associated with a less healthy, higher-fat diet in general (I can speak from experience on that). If that’s true, and if high-fat diets – whether featuring lots of barbecue or not – are associated with these same poor health outcomes, then we’d have the picture on the left side of Figure 10.3.3. The red bubble represents the confounding factor, which is influencing both i.v. and d.v. If this picture were the correct one of the underlying phenomenon, then the correlation we thought we were picking up between barbecue and cancer would actually be due to fat content.
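
    One common way to check a suspected confounder is to compare like with like: split the data by the confounder and look inside each group. The sketch below is hypothetical throughout. The probabilities are made up, and the simulated world is deliberately built so that cancer risk depends on diet only and not on barbecue at all. It says nothing about what the actual study found.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# counts[(high_fat, eats_bbq)] = [number of people, number of cancer cases]
counts = {(f, b): [0, 0] for f in (False, True) for b in (False, True)}

for _ in range(100000):
    high_fat = rng.random() < 0.5
    # Assumption: people on a high-fat diet are more likely to eat barbecue.
    eats_bbq = rng.random() < (0.7 if high_fat else 0.2)
    # Assumption: cancer risk depends on diet only, NOT on barbecue.
    cancer = rng.random() < (0.10 if high_fat else 0.04)
    counts[(high_fat, eats_bbq)][0] += 1
    counts[(high_fat, eats_bbq)][1] += int(cancer)

def rate(groups):
    """Cancer rate pooled over the given (high_fat, eats_bbq) groups."""
    people = sum(counts[g][0] for g in groups)
    cases = sum(counts[g][1] for g in groups)
    return cases / people

# Naive comparison, ignoring diet: barbecue looks dangerous.
bbq_rate = rate([(False, True), (True, True)])
no_bbq_rate = rate([(False, False), (True, False)])
print(f"all people:    bbq {bbq_rate:.3f}  vs  no bbq {no_bbq_rate:.3f}")

# Stratified comparison: hold the confounder fixed, then compare.
for high_fat in (True, False):
    label = "high-fat diet" if high_fat else "low-fat diet "
    print(f"{label}: bbq {rate([(high_fat, True)]):.3f}  vs  "
          f"no bbq {rate([(high_fat, False)]):.3f}")
```

    In this made-up world the pooled rates differ noticeably between barbecue eaters and everyone else, while within each diet group the two rates are nearly the same. That is the signature of the left-hand picture: the apparent barbecue effect belongs to the diet.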

    clipboard_e4bd6fbb9aaa7d419b9bb3ca11f12b3a6.png

    Figure 10.3.2: A hypothesis as to causality: eating barbecued foods increases one’s risk for certain types of cancer.

    Another example is the right-hand side of Figure 10.3.3, below. Perhaps barbecue is more popular culturally in some areas of the country (say, the South, where I certainly see it eaten a lot), and perhaps those areas have other environmental factors that can lead to cancer. In this case, the “South” confounder indirectly affects the d.v. (via another variable, the environment) but it still affects it.

    It’s not hard to think of others. These were just the first two that came to mind. The point is that it’s really hard to be sure you’ve thought of all of them!

    Paranoia and Overparanoia

    All this should lead you to be somewhat paranoid, but not over-paranoid. Confounding variables can definitely lead us to make mistakes in our reasoning, but perhaps they’re not quite as common as you think. Understand that a confounding factor is not simply any other factor that affects the dependent variable. Instead, for a variable to be confounding it must affect both the independent and the dependent variable.

    clipboard_e462913daabe95f114ff6306a35fecd4c.png

    Figure 10.3.3: Other hypotheses as to causality, each resulting in the same associations in the data, yet involving confounding factors.

    Let me illustrate with an example. I suspect that on average, men are taller than women. And I further suspect that there’s causality here, and that it goes from A (sex) to B (height), not the other way around. (Clearly people don’t spontaneously “turn male” because they reach a certain height.) So my thinking on the subject is summed up in Figure 10.3.4.

    clipboard_eccaa9d949428fc32b6479d7622d488bc.png

    Figure 10.3.4: Stephen’s hypothesis: a person’s biological sex (male or female) plays a causal role in determining their height.

    Now let me show you what I mean by “overparanoia.” What if someone said, “but wait, Stephen, not so fast! You’ve got potential confounding variables out the wazoo! Why, surely heredity plays some role in a person’s height – tall parents are more likely to have tall offspring, just due to genetics. And nutrition, too, is a factor: it’s been demonstrated that impoverished communities suffering from malnutrition will have children with stunted growth. And heck, if you’re born at a high elevation (like Nepal), there’s less gravitational pull dragging your body down to earth, so it stands to reason that you’ll probably grow taller. And on and on!” Figure 10.3.5 depicts this (supposed) scientific nightmare.

    But plausible as some of those theories are, they are not confounding variables! These are simply other factors that may affect the d.v. Sure, they may also play a causal role in determining a person’s height, but they do not invalidate our finding about sex and height.

    For them to truly be confounders, they would have to affect both the yellow and the green variables, and I’m pretty sure they do not. Do tall parents tend to bear more sons, and short parents more daughters? If not, this isn’t a confounder. Do boys have more nutritious diets than girls? (In some parts of the world, that may unfortunately be true, but I don’t believe it is in our country.) So that one isn’t a confounder either. Having additional causes of an effect does not nullify a genuine effect. Only a lurking variable that pulls the marionette strings of both i.v. and d.v. can do that.
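
    The distinction between "another cause of the d.v." and "a confounder" can be checked with a small simulation. This is only a sketch under invented assumptions: the 13 cm sex effect, the heredity coefficient, and the noise levels are all made up, and the `parents_affect_sex` switch describes a purely hypothetical world in which tall parents really do bear more sons.

```python
import random

TRUE_SEX_EFFECT = 13.0  # cm; a made-up number for illustration

def mean(values):
    return sum(values) / len(values)

def observed_gap(parents_affect_sex, n=40000, seed=1):
    """Return mean(male height) - mean(female height) in simulated data."""
    rng = random.Random(seed)
    males, females = [], []
    for _ in range(n):
        # How far the parents' height sits above or below average, in cm.
        parent_dev = rng.gauss(0, 7)
        # Hypothetical confounded world: taller parents bear more sons.
        # Otherwise the child's sex is a fair coin flip.
        if parents_affect_sex:
            p_male = 0.8 if parent_dev > 0 else 0.2
        else:
            p_male = 0.5
        is_male = rng.random() < p_male
        height = (162.0
                  + (TRUE_SEX_EFFECT if is_male else 0.0)
                  + 0.5 * parent_dev   # heredity affects height either way
                  + rng.gauss(0, 5))   # everything else
        (males if is_male else females).append(height)
    return mean(males) - mean(females)

gap_extra_cause = observed_gap(parents_affect_sex=False)
gap_confounded = observed_gap(parents_affect_sex=True)

print(f"true effect built into the simulation: {TRUE_SEX_EFFECT:.1f} cm")
print(f"heredity affects height only:          {gap_extra_cause:.1f} cm")
print(f"heredity affects sex AND height:       {gap_confounded:.1f} cm")
```

    When heredity influences height alone, the observed male/female gap lands close to the true effect: the extra cause adds noise but no bias. Only when heredity is also allowed to influence the i.v. does the observed gap drift away from the truth.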

    clipboard_e8f72a06401eb19773ebda0a811c096a3.png

    Figure 10.3.5: Oh geez – confounding variables galore? No!


    This page titled 10.3: Confounding Factors is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Stephen Davies (allthemath.org) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.