Engineering LibreTexts

5: Probability


    We have been considering a model of an information handling system in which symbols from an input are encoded into bits, which are then sent across a “channel” to a receiver and get decoded back into symbols. See Figure 5.1.

    Figure 5.1: Communication system

    In earlier chapters of these notes we have looked at various components in this model. Now we return to the source and model it more fully, in terms of probability distributions.

    The source provides a symbol or a sequence of symbols, selected from some set. The selection process might be an experiment, such as flipping a coin or rolling dice. Or it might be the observation of actions not caused by the observer. Or the sequence of symbols could be from a representation of some object, such as characters from text, or pixels from an image.

    We consider only cases with a finite number of symbols to choose from, and only cases in which the symbols are both mutually exclusive (only one can be chosen at a time) and exhaustive (one is actually chosen). Each choice constitutes an “outcome,” and our objective is to trace the sequence of outcomes, and the information that accompanies them, as the information travels from the input to the output. To do that, we need to be able to say what the outcome is, and also to express what we know about properties of the outcome.

    If we know the outcome, we have a perfectly good way of denoting the result. We can simply name the symbol chosen, and ignore all the rest of the symbols, which were not chosen. But what if we do not yet know the outcome, or are uncertain to any degree? How are we supposed to express our state of knowledge if there is uncertainty? We will use the mathematics of probability for this purpose.

    To illustrate this important idea, we will use examples based on the characteristics of MIT students. The official count of students at MIT\(^1\) for Fall 2007 includes the data in Table 5.1, which is reproduced in Venn diagram format in Figure 5.2.

                         Women     Men    Total
    Freshmen               496     577    1,073
    Undergraduates       1,857   2,315    4,172
    Graduate Students    1,822   4,226    6,048
    Total Students       3,679   6,541   10,220
    Table 5.1: Demographic data for MIT, Fall 2007
    Figure 5.2: A Venn diagram of MIT student data, with areas that should be proportional to the sizes of the subpopulations.
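As a quick arithmetic check on Table 5.1, the short Python sketch below verifies that the counts are internally consistent. The variable and function names are illustrative only. It assumes, as the totals suggest, that freshmen are already counted among the undergraduates, so only the undergraduate and graduate rows are added to obtain the total.

```python
# Counts copied from Table 5.1 (MIT, Fall 2007): (women, men, total).
table = {
    "Freshmen": (496, 577, 1073),
    "Undergraduates": (1857, 2315, 4172),
    "Graduate Students": (1822, 4226, 6048),
    "Total Students": (3679, 6541, 10220),
}


def row_is_consistent(row):
    """Return True if women + men equals the stated total for one row."""
    women, men, total = row
    return women + men == total


# Every row should satisfy women + men == total.
rows_ok = all(row_is_consistent(row) for row in table.values())

# Assumption: freshmen are a subset of undergraduates, so the last row
# should equal undergraduates + graduate students, column by column.
undergrads = table["Undergraduates"]
grads = table["Graduate Students"]
everyone = table["Total Students"]
columns_ok = all(u + g == t for u, g, t in zip(undergrads, grads, everyone))

print("rows consistent:", rows_ok)
print("columns consistent:", columns_ok)
```

Both checks pass for the figures in the table, which supports reading the freshman row as a subset of the undergraduate row.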

    Suppose an MIT freshman is selected (the symbol being chosen is an individual student, and the set of possible symbols is the 1,073 freshmen), and you are not informed who it is. You wonder whether it is a woman or a man. Of course, if you knew the identity of the student selected, you would know the gender. But if not, how could you characterize your knowledge? What is the likelihood, or probability, that a woman was selected?

    Note that about 46% of the 2007 freshman class (496/1,073) are women. This is a fact, or a statistic, which may or may not represent the probability that the freshman chosen is a woman. If you had reason to believe that all freshmen were equally likely to be chosen, you might decide that the probability of it being a woman is 46%. But what if you are told that the selection is made in the corridor of McCormick Hall (a women’s dormitory)? In that case the probability that the freshman chosen is a woman is probably higher than 46%. Statistics and probabilities can both be described using the same mathematics (to be developed next), but they are different things.
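The distinction between the statistic and the probability can be illustrated with a small Python sketch. The function `prob_woman` and its per-person selection weights are hypothetical, introduced only for this illustration. In particular, the factor of 10 used for the McCormick Hall case is an arbitrary assumption, not a figure from the text.

```python
from fractions import Fraction

# Freshman counts from Table 5.1.
women, men = 496, 577

# The statistic: the fraction of the freshman class who are women.
statistic = Fraction(women, women + men)


def prob_woman(weight_woman, weight_man):
    """Probability that the selected freshman is a woman when each woman is
    chosen with relative weight weight_woman and each man with weight_man.
    (A hypothetical selection model, for illustration only.)"""
    w = women * weight_woman
    m = men * weight_man
    return Fraction(w, w + m)


# If every freshman is equally likely to be chosen, the probability
# coincides with the statistic.
uniform = prob_woman(1, 1)

# Assumed example: near a women's dormitory, suppose each woman is ten
# times as likely to be encountered as each man (made-up weight).
biased = prob_woman(10, 1)

print(f"statistic        : {float(statistic):.3f}")
print(f"uniform selection: {float(uniform):.3f}")
print(f"biased selection : {float(biased):.3f}")
```

The class composition is the same in both cases. Only the assumed selection process changes, and with it the probability.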


    \(^1\)all students: http://web.mit.edu/registrar/www/sta...portfinal.html, all women: http://web.mit.edu/registrar/www/sta...omenfinal.html


    This page titled 5: Probability is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Paul Penfield, Jr. (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.