Chapter 2: Histograms, Statistical Measures, and Probability
Here you can find pre-recorded lecture videos that cover this topic: https://youtube.com/playlist?list=PL...gEHsGBZJpQhGzn
Sample, Population, and Distribution Error
Before we start to analyze precision error we must understand two key concepts: i) the distribution of error and ii) the population from which a sample is taken.
• Distribution of error: characterizes the probability that an error of a given size will occur.
• Population from which a sample is drawn: experimentally we have a limited set of observations, our sample, from which we will infer the characteristics of the larger population.
Understanding Sample Vs. Population: Bag of Marbles
In any bag of marbles there will be a distribution of diameters. To estimate the mean diameter we can take a handful of marbles (sample) drawn from the bag (population).
No two handfuls will yield exactly the same average value, but each should approximate the average of the population to some level of uncertainty, provided the sample is large enough to represent the population. The handful is a repeated sample, whereas picking one marble would have been a single sample.
You can see this visually in the Mathematica Notebook simulation below, where we create a population of marble diameters and take samples of different sizes.
Visualizing Data: Histograms
One thing you will immediately see here is that we are representing the data using a histogram. A histogram shows the frequency, i.e. the number of times each measurand value was observed: the measurand value goes on the x-axis and the frequency on the y-axis. We will talk about distributions in detail later, but for now note that the lower the frequency of a measurand value, the rarer it is to encounter, and the higher the frequency, the more likely it is to be encountered.
Notice that the sample mean varies from handful to handful and that, as you might anticipate, grabbing larger and larger samples gives a closer approximation to the true population mean.
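As a quick stand-in for the Mathematica notebook, here is a minimal Python sketch of the same experiment; the population size, mean, and spread of the marble diameters are hypothetical values chosen purely for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=1)

# Hypothetical population: 10,000 marble diameters (mm), normally distributed.
population = rng.normal(loc=15.0, scale=0.5, size=10_000)

fig, axes = plt.subplots(1, 3, figsize=(12, 3), sharex=True)
for ax, n in zip(axes, [10, 100, 1000]):
    sample = rng.choice(population, size=n, replace=False)  # one "handful"
    ax.hist(sample, bins=20)                  # frequency vs. measurand value
    ax.set_title(f"n = {n}, mean = {sample.mean():.3f}")
    ax.set_xlabel("diameter (mm)")
axes[0].set_ylabel("frequency")

print(f"population mean = {population.mean():.3f}")
plt.show()
```

Running this a few times shows exactly the behavior described above: the small-sample histograms (and their means) bounce around, while the large-sample histograms settle toward the population distribution.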
Sampling:
• A sample of size n is drawn from a finite population of size p. We assume additional data cannot be added to the population and that n << p.
• A finite number of items, n, is randomly drawn from a population of indefinite size, and the properties of the population are inferred from the sample.
Sample Vs. Population Parameters
Population Mean and Standard Deviation
Typically, the population mean, µ, is unknown, since the population is effectively infinite and as experimentalists we can never know the full population distribution. If we could measure all N items in the population, however, it would be calculated as follows
\begin{equation}
\mu = \frac{x_{1} + x_{2} + ... + x_{N}}{N} = \sum\limits_{i=1}^N \frac{x_{i}}{N}
\end{equation}
By averaging a large sample we are able to estimate the true value of the population. However, we will develop a much more systematic approach to estimating bounds on the true value of a population, coming soon....
Standard Deviation
The deviation, d, is the amount by which a single measurement deviates from the mean value of the population
\begin{equation}
d = x - \mu
\end{equation}
The mean squared deviation, \(\sigma^2\), is approximated by averaging the squared deviations of a very large sample; its square root is the population standard deviation, \(\sigma\):
\begin{equation}
\sigma \approx \sqrt{\frac{d_{1}^{2} + d_{2}^{2} + ... + d_{n}^{2}}{n}}
\end{equation}
σ is the standard deviation of the population; it characterizes the deviation from the mean value and the width of the Gaussian. Again, more on this later.
Sample Mean and Standard Deviation
As we have previously discussed, it is often impractical, or at times impossible, to work with an entire population. Instead, as experimentalists we work with samples from a population, and we use values computed from the sample to estimate the mean and standard deviation of the population.
The sample mean, \(\overline{x}\), is defined as
\begin{equation}
\overline{x} = \sum\limits_{i=1}^n \frac{x_{i}}{n} = \frac{x_{1} + x_{2} + ... + x_{n}}{n}
\end{equation}
and as we have previously discussed the sample mean, \(\overline{x}\), can be used to approximate the population mean, µ, for large sample sizes, i.e. n ≥ 30. Similarly we can calculate the sample standard deviation, \(S_x\),
\begin{equation}
S_{x} = \sqrt{\frac{(x_{1} - \overline{x})^{2} + (x_{2} - \overline{x})^{2} + ... + (x_{n} - \overline{x})^{2}}{n-1}}
\end{equation}
which can be used to approximate the population standard deviation, \(\sigma\). Just a quick reminder that n is the number of data points/measurements in the sample.
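As a minimal sketch, here is how the sample mean and sample standard deviation can be computed in Python (the data values are hypothetical); note that NumPy's `ddof=1` option gives the n − 1 denominator used above:

```python
import numpy as np

# Hypothetical sample of marble diameters (mm).
x = np.array([14.8, 15.1, 15.0, 14.9, 15.3, 15.2])

n = x.size
xbar = x.sum() / n                               # sample mean
Sx = np.sqrt(((x - xbar) ** 2).sum() / (n - 1))  # sample std dev, n - 1 denominator

# NumPy equivalents: ddof=1 gives the n-1 (sample) denominator,
# ddof=0 (the default) gives the n (population) denominator.
assert np.isclose(xbar, x.mean())
assert np.isclose(Sx, x.std(ddof=1))
print(f"xbar = {xbar:.4f}, Sx = {Sx:.4f}")
```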
But let’s go back to this idea of a histogram and the frequency of encountering measurements, because it is closely tied to probability. Let’s refresh our understanding of this very fundamental and important concept.
Number of Possibilities
Before looking at probabilities it is often very important to determine the number of possibilities in a given scenario. If we have sets A1, A2, ..., Ak which contain, respectively, n1, n2, ..., nk elements, then there are n1 · n2 · · · nk ways of choosing first an element of A1, then one of A2, and finally one of Ak.
Let’s do an example to put this into practice.
Consider the following scenario: I am trying to maximize the yield of a field where I am planting lettuce. There are two options of fertilizer, F1 and F2, 4 blocks of land to plant (B1-B4), and 3 types of seeding density (S1-S3). How many planting combinations can we observe?
Well, from the given example there are n = 2 options for F, 4 for B, and 3 for S, so applying this rule we have
\begin{equation}
2\cdot 4 \cdot 3 = 24
\end{equation}
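A minimal Python sketch of this counting rule, enumerating the lettuce scenario explicitly (the labels are just illustrative strings):

```python
from itertools import product

fertilizers = ["F1", "F2"]
blocks = ["B1", "B2", "B3", "B4"]
densities = ["S1", "S2", "S3"]

# One element from each set per planting plan: 2 * 4 * 3 ways.
combos = list(product(fertilizers, blocks, densities))
print(len(combos))  # 24
```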
How about another quick problem: I make a true-false exam (I will never do that, by the way) with 15 questions. How many combinations of answers can we observe?
\begin{equation}
2^{15}=32768
\end{equation}
Lots of grading options for me.
This concept is closely related to the number of ways to arrange objects distinctly, which you may remember was called.....
Permutations
If r objects are chosen from a set of n distinct objects, any particular arrangement or order of these objects is a permutation. In such a scenario the number of permutations, \(_n P_r\), is defined as
\begin{equation}
_n P_r = \frac{n!}{(n-r)!}
\end{equation}
Let’s consider an example: we have 15 candidates; how many ways can we choose a president, vice president, chief of staff, and treasurer?
Well, what is the number of objects chosen, r? Here we see that r = 4, and the set of distinct objects has n = 15.
\begin{equation}
\frac{15!}{11!}=32760
\end{equation}
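If you'd like to check this on a computer, a minimal sketch using Python's standard math module, which provides the permutation count directly:

```python
from math import perm, factorial

# 4 distinct offices filled from 15 candidates: order matters.
print(perm(15, 4))                      # 32760
print(factorial(15) // factorial(11))   # same result from n!/(n-r)!
```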
Combinations
If we only care about which r objects are selected from n distinct objects, not the order in which they are arranged, the number of combinations of n objects taken r at a time is denoted by
\begin{equation}
_n C_r = \frac{n!}{r!(n-r)!}
\end{equation}
Let’s say I have a class of 23 students. How many ways can we choose a team of 3?
Here we have that n = 23 and r = 3
\begin{equation}
\frac{23!}{3!20!} = 1771
\end{equation}
How would you expect the number of combinations to change if we increase the team size to 8?
\begin{equation}
\frac{23!}{8!15!}=490314
\end{equation}
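Again, a quick sketch using Python's built-in combination count:

```python
from math import comb

print(comb(23, 3))  # 1771 possible teams of 3
print(comb(23, 8))  # 490314 possible teams of 8
```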
Probability
Now that we have the number of possibilities, permutations, and combinations defined, we can start to define probability, P, which is simply defined as
\begin{equation}
P=\frac{s}{m}
\end{equation}
where there are m equally likely possibilities and s of them are successful or favorable outcomes; the probability of a success is then given as above.
Let’s consider a single, completely randomly shuffled deck of 52 cards. What is the probability of pulling a red queen? Well, here m = 52, and there are only two red queens in the deck, so s = 2:
\begin{equation}
\frac{2}{52} = \frac{1}{26}
\end{equation}
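A small sketch that builds the 52-card sample space explicitly and counts the favorable outcomes (the rank and suit labels are just illustrative strings):

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]  # hearts/diamonds are red
deck = list(product(ranks, suits))                 # m = 52 equally likely cards

# s = number of red queens in the deck.
s = sum(1 for rank, suit in deck
        if rank == "Q" and suit in ("hearts", "diamonds"))
print(s / len(deck))  # 2/52 = 1/26 ≈ 0.0385
```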
Probabilities: Mutually Exclusive Events
There are some scenarios where we care about the probability of one of several events occurring. For example, if I have events A1, A2, ..., An which are all mutually exclusive events in a sample space S, then
\begin{equation}
P(A_1 \cup A_2 \cup ...\cup A_n) = P(A_1) + P(A_2) +....+P(A_n)
\end{equation}
Let’s see how this can be applied to an example. Continuing with our card example, what is the probability of pulling a red card, a club, or the 2 of spades?
First let’s make sure these are indeed mutually exclusive events: no single card can be both red and a club, or both red and the 2 of spades, so no two of the events can occur together. We can therefore simply sum each event’s probability:
\begin{equation}
\frac{1}{2}+\frac{1}{4}+\frac{1}{52} =0.7692308
\end{equation}
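A quick check by direct enumeration of the deck (same hypothetical string labels as before):

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

def is_red(card):       return card[1] in ("hearts", "diamonds")  # 26 cards
def is_club(card):      return card[1] == "clubs"                 # 13 cards
def is_2spades(card):   return card == ("2", "spades")            # 1 card

# The three events never overlap, so P(union) is just the sum: 40/52.
favorable = sum(1 for c in deck if is_red(c) or is_club(c) or is_2spades(c))
print(favorable / 52)  # 0.7692...
```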
Probabilities: Not Mutually Exclusive
Now, if A and B are any events in the sample space S, not necessarily mutually exclusive, then
\begin{equation}
P(A \cup B) = P(A) + P(B) - P(A \cap B)
\end{equation}
We would get an erroneous result if we applied the previous framework directly; we must subtract any overlapping probability so that it is not counted twice.
Let’s consider this example: what is the probability of pulling a card that is red or an ace?
So the probability of pulling a red card is \(\frac{1}{2}\), the probability of pulling an ace is \(\frac{1}{13}\), and the probability that we pull a red ace is \(\frac{1}{26}\) so we find that
\begin{equation}
\frac{1}{2}+\frac{1}{13}-\frac{1}{26} = 0.5384615
\end{equation}
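A sketch verifying the inclusion-exclusion correction by enumerating the deck:

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

red  = {c for c in deck if c[1] in ("hearts", "diamonds")}  # 26 cards
aces = {c for c in deck if c[0] == "A"}                     # 4 cards

# The naive sum double-counts the two red aces; inclusion-exclusion fixes it.
print((len(red) + len(aces) - len(red & aces)) / 52)  # 28/52 ≈ 0.5385
print(len(red | aces) / 52)                           # same, by direct count
```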
Probabilities: Conditional Probabilities
There are also conditional probabilities. For example, if A and B are any events in the sample space S and \(P(B) \neq 0\), then the conditional probability of A given B is
\begin{equation}
P (A | B) = \frac{ P(A \cap B)}{P(B)}
\end{equation}
Let’s illustrate this with an example, and keep going with our cards as long as we can: what is the conditional probability that we pull an ace given that the card is red?
We read this as: given that the card is red, with \(P(B) = \frac{1}{2}\), what is the probability that it is also an ace? The probability of pulling a red ace is \(P(A \cap B) = \frac{1}{26}\), so
\begin{equation}
\frac{\frac{1}{26}}{\frac{1}{2}} = 0.07692308
\end{equation}
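A minimal sketch using Python's exact fractions to evaluate the conditional probability above:

```python
from fractions import Fraction

p_red         = Fraction(26, 52)  # P(B): card is red
p_ace_and_red = Fraction(2, 52)   # P(A ∩ B): card is a red ace

p_ace_given_red = p_ace_and_red / p_red
print(p_ace_given_red)          # 1/13
print(float(p_ace_given_red))   # ≈ 0.0769
```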
If we rearrange this conditional probability we can show that
\begin{eqnarray}
P(A \cap B) = P(A) \cdot P(B|A)\\
P(A \cap B) = P(B) \cdot P(A|B)
\end{eqnarray}
this is assuming that \(P(A) \neq 0\) and \(P(B) \neq 0\). Then it follows that two events A and B will be independent events if and only if
\begin{equation}
P(A \cap B) = P(A) \cdot P (B)
\end{equation}
Let’s see if our card example was truly independent. We said that the probability of pulling a red ace was \(P(A \cap B) = \frac{1}{26}\), the probability of a card being red was \(P(A) = \frac{1}{2}\), and the probability of pulling an ace was \(P(B) = \frac{1}{13}\), so we find that
\begin{equation}
\frac{1}{26} = \frac{1}{2} \cdot \frac{1}{13} = \frac{1}{26}
\end{equation}
So they are independent!! Now consider how this may change if the card is not replaced, a possible Pset question, hmmmmm???
Bayes’ Theorem
These multiplication rules for probabilities are extremely useful, but we can also imagine calculating a probability in several steps. Imagine B1, B2, ..., Bn are mutually exclusive events of which one must occur; then we have the general expression, sometimes referred to as the rule of elimination or the rule of total probability,
\begin{equation}
P(A) = \sum^n_{i=1} P(B_i) \cdot P(A |B_i)
\end{equation}
We can further generalize this by invoking Bayes’ theorem: if B1, B2, ..., Bn are mutually exclusive events of which one must occur, then
\begin{equation}
P(B_r|A) = \frac{P(B_r)\cdot P(A|B_r)}{\sum^n_{i=1} P(B_i) \cdot P(A|B_i)}
\end{equation}
for r = 1,2,...,n.
Let’s see if we pull one more card out of our sleeve, excuse the horrible pun, to create one more example...
A card is lost from a pack of 52 cards. From the remaining cards, two cards are randomly drawn and found to be hearts. What is the probability that the lost card is also a heart?
We can define the following events: the lost card can be a heart, club, spade, or diamond, so we set up events B1, B2, B3, and B4 for losing a heart, club, spade, or diamond respectively. Here we can also state that A is the event of drawing two hearts after a card is lost.
We can then determine that \(P(B_1) = P(B_2) =P(B_3) =P(B_4) = \frac{1}{4}\).
We can also state that the probability of drawing two hearts given that the lost card is a heart \(P(A|B_1) = \frac{12}{51} \cdot \frac{11}{50}\).
The probability of drawing two hearts given the lost card is a club is \(P(A|B_2) = \frac{13}{51} \cdot \frac{12}{50}\). And we can also see that \(P(A|B_2) =P(A|B_3) =P(A|B_4)\).
Now we can plug into our equation and we find that
\begin{equation}
P(B_1 |A) = \frac{P(B_1) \cdot P(A|B_1)}{P(B_1) \cdot P(A|B_1)+P(B_2) \cdot P(A|B_2)+P(B_3) \cdot P(A|B_3)+P(B_4) \cdot P(A|B_4)}
\end{equation}
and we find the probability is \(\frac{132}{600} = 0.22\)!!!
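A minimal sketch verifying this with exact fractions; the priors and likelihoods are copied straight from the setup above:

```python
from fractions import Fraction

# Priors: the lost card is equally likely to be any of the four suits.
priors = [Fraction(1, 4)] * 4

# Likelihoods of drawing two hearts from the remaining 51 cards,
# given which suit was lost (12 hearts remain if a heart was lost, else 13).
likelihoods = [
    Fraction(12, 51) * Fraction(11, 50),  # lost card is a heart
    Fraction(13, 51) * Fraction(12, 50),  # lost card is a club
    Fraction(13, 51) * Fraction(12, 50),  # lost card is a spade
    Fraction(13, 51) * Fraction(12, 50),  # lost card is a diamond
]

# Bayes' theorem: posterior for "lost card is a heart" given the two draws.
total = sum(p * l for p, l in zip(priors, likelihoods))
posterior = priors[0] * likelihoods[0] / total
print(posterior, float(posterior))  # 11/50 = 0.22
```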
Look forward to some more of these problems in your Pset.