13.12: Factor analysis and ANOVA
Authors: Alexander Voice, Andrew Wilkins, Rohan Parambi, Ibrahim Oraiqat
Stewards: Irene Brockman, Chloe Sweet, Rob Chockley, Scott Dombrowski
12.1 Introduction
First invented in the early 1900s by psychologist Charles Spearman, factor analysis is the process by which a complicated system of many variables is simplified by completely defining it with a smaller number of "factors." If these factors can be studied and determined, they can be used to predict the value of the variables in a system. A simple example would be using a person's intelligence (a factor) to predict their verbal, quantitative, writing, and analytical scores on the GRE (variables).
Analysis of variance (ANOVA) is the method used to compare continuous measurements to determine if the measurements are sampled from the same or different distributions. It is an analytical tool used to determine the significance of factors on measurements by looking at the relationship between a quantitative "response variable" and a proposed explanatory "factor." This method is similar to the process of comparing the statistical difference between two samples, in that it invokes the concept of hypothesis testing. Instead of comparing two samples, however, a variable is correlated with one or more explanatory factors, typically using the F-statistic. From this F-statistic, the p-value can be calculated to see if the difference is significant. For example, if the p-value is low (p < 0.05 or p < 0.01, depending on the desired level of significance), then there is a low probability that the two groups are the same. The method is highly versatile in that it can be used to analyze complicated systems with numerous variables and factors. In this article, we will discuss the computations involved in single-factor, two-factor without replicates, and two-factor with replicates ANOVA. Below is a brief overview of the different types of ANOVA and some examples of when each can be applied.
12.1.1 Overview and Examples of ANOVA Types
ANOVA Types
Single-Factor ANOVA (One-Way):
One-way ANOVA is used to test for variance among two or more independent groups of data, in the instance that the variance depends on a single factor. It is most often employed when there are at least three groups of data; otherwise, a t-test would be a sufficient statistical analysis.
Two-Factor ANOVA (Two-Way):
Two-way ANOVA is used in the instance that the variance depends on two factors. There are two cases in which two-way ANOVA can be employed:
 Data without replicates: used when collecting a single data point for a specified condition
 Data with replicates: used when collecting multiple data points for a specified condition (the number of replicates must be specified and must be the same among data groups)
When to Use Each ANOVA Type
 Example: There are three identical reactors (R1, R2, R3) that generate the same product.
 One-way ANOVA: You want to analyze the variance of the product yield as a function of the reactor number.
 Two-way ANOVA without replicates: You want to analyze the variance of the product yield as a function of the reactor number and the catalyst concentration.
 Two-way ANOVA with replicates: For each catalyst concentration, triplicate data were taken. You want to analyze the variance of the product yield as a function of the reactor number and the catalyst concentration.
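To make the one-way case concrete, here is a minimal sketch in Python using `scipy.stats.f_oneway`; the reactor yield values are invented for illustration:

```python
# One-way ANOVA: does product yield vary with reactor number?
# Yield values below are invented for illustration.
from scipy import stats

yield_r1 = [89.1, 90.2, 88.7, 89.5]   # reactor R1
yield_r2 = [91.0, 90.8, 91.5, 90.4]   # reactor R2
yield_r3 = [88.9, 89.3, 88.5, 89.0]   # reactor R3

f_stat, p_value = stats.f_oneway(yield_r1, yield_r2, yield_r3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A p-value below 0.05 suggests the mean yields differ between reactors.
```

The two-way cases add a second grouping (e.g., catalyst concentration) and, with replicates, an interaction term, as developed later in this section.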
ANOVA is a Linear Model
Though ANOVA will tell you whether factors are significantly different, it will do so according to a linear model. Because ANOVA always assumes a linear model, it is important to consider strong nonlinear interactions that ANOVA may not capture when determining significance. ANOVA models each observation as an overall mean plus a mean effect plus noise. If there are nonlinear relationships between these (for example, if the difference between column 1 and column 2 on the same row is that column 2 = column 1 squared), then there is a chance that ANOVA will not catch it.
12.2 Key Terms
Before further explanation, please review the terms below, which are used throughout this Wiki.
12.3 Comparison of Sample Means Using the F-Test
The F-test is based on the ratio of sample variances. The F-statistic and the corresponding F-test are used in single-factor ANOVA for purposes of hypothesis testing.
Null hypothesis (H_{o}): all sample means arising from different factors are equal
Alternative hypothesis (H_{a}): the sample means are not all equal
Several assumptions are necessary to use the F-test:
 The samples are independent and random
 The distribution of the response variable is a normal curve within each population
 The different populations may have different means
 All populations have the same standard deviation
12.3.1 Introduction to the F-Statistic
The F-statistic is the ratio of two variance estimates: the variance between groups divided by the variance within groups. The larger the F-statistic, the more likely it is that the difference between samples is due to the factor being tested, and not just the natural variation within a group. A standardized table can be used to find F_{critical} for any system. F_{critical} will depend on alpha, the significance level; typically, a value of alpha = 0.05 is used, which corresponds to 95% confidence. If F_{observed} > F_{critical}, we conclude with 95% confidence that the null hypothesis is false. For an explanation of how to read an F-table, see Interpreting the F-statistic (below). In a similar manner, F-tables can also be used to determine the p-value for a given data set. The p-value for a given data set is the probability of obtaining that data set if the null hypothesis were true: that is, if the results were strictly due to chance. When H_{o} is true, the F-statistic has an F-distribution.
12.3.2 F-Distributions
The F-distribution is important to ANOVA because it is used to find the p-value for an ANOVA F-test. The F-distribution arises as the ratio of two chi-squared distributions; thus, this family has both numerator and denominator degrees of freedom. (For information on the chi-squared test, click here.) Every function of this family has a skewed distribution and a minimum value of zero.
Figure 1 - F-distribution with alpha and F_{critical} indicated
12.4 Single-Factor Analysis of Variance
In the case of single-factor analysis, also called single-classification or one-way ANOVA, a factor is varied while observing the result on the set of dependent variables. These dependent variables belong to a specific related set of values, and hence the results are expected to be related.
This section will describe some of the computational details for the F-statistic in one-way ANOVA. Although these equations provide insight into the concept of analysis of variance and how the F-test is constructed, it is not necessary to memorize the formulas or to do this analysis by hand; in practice, computers are almost always used to do one-way ANOVA.
12.4.1 Setting up an Analysis of Variance Table
The fundamental concept in one-way analysis of variance is that the variation among data points in all samples can be divided into two categories: variation between group means and variation between data points within a group. The theory for analysis of variance stems from a simple equation, stating that the total variation is equal to the sum of the variation between groups and the variation within groups:
Total variation = variation between groups + variation within groups
An analysis of variance table is used to organize data points, indicating the value of a response variable, into groups according to the factor used in each case. For example, Table 1 is an ANOVA table for comparing the amount of weight lost over a three month period by dieters on various weightloss programs.
Table 1 - Amount of weight lost by dieters on various programs over a 3-month period

Program 1 | Program 2 | Program 3
    7     |     9     |    15
    9     |    11     |    12
    5     |     7     |    18
    7     |           |
A reasonable question is, can the type of program (a factor) be used to predict the amount of weight a dieter would lose on that program (a response variable)? Or, in other words, is any one program superior to the others?
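As a quick check of this question, one-way ANOVA can be run on the Table 1 data with SciPy. Note that the flattened table leaves the exact grouping ambiguous (and the degrees-of-freedom count quoted later in the text implies some values may be missing), so the grouping below is an assumption:

```python
# One-way ANOVA on the Table 1 weight-loss data.
# Grouping assumed: Program 1 = {7, 9, 5, 7}, Program 2 = {9, 11, 7},
# Program 3 = {15, 12, 18} (the flattened table makes this ambiguous).
from scipy import stats

program1 = [7, 9, 5, 7]
program2 = [9, 11, 7]
program3 = [15, 12, 18]

f_stat, p_value = stats.f_oneway(program1, program2, program3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A small p-value here would indicate that at least one program's mean weight loss differs from the others.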
12.4.2 Measuring Variation Between Groups
The variation between group means is measured with a weighted sum of squared differences between the sample means and the overall mean of all the data. Each squared difference is multiplied by the appropriate group sample size, n_{i}, in this sum. This quantity is called the sum of squares between groups, or SS Groups:

SS Groups = Σ n_{i}(x̄_{i} - x̄)², where the sum runs over the k groups

The numerator of the F-statistic for comparing means is called the mean square between groups, or MS Groups, and it is calculated as

MS Groups = SS Groups / (k - 1)
12.4.3 Measuring Variation Within Groups
To measure the variation among data points within the groups, find the sum of squared deviations between data values and the sample mean in each group, and then add these quantities. This is called the sum of squared errors, SSE, or sum of squares within groups:

SSE = Σ (n_{i} - 1)s_{i}²

where s_{i}² is the variance within each group.
The denominator of the F-statistic is called the mean square error, MSE, or mean squares within groups. It is calculated as

MSE = SSE / (N - k)

MSE is simply a weighted average of the sample variances for the k groups. Therefore, if all n_{i} are equal, MSE is simply the average of the k sample variances. The square root of MSE (s_{p}), called the pooled standard deviation, estimates the population standard deviation of the response variable (keep in mind that all of the samples being compared are assumed to have the same standard deviation σ).
12.4.4 Measuring the Total Variation
The total variation in all samples combined is measured by computing the sum of squared deviations between data values and the mean of all data points. This quantity is referred to as the total sum of squares, SS Total (also written SSTO). A formula for the sum of squared differences from the overall mean is

SS Total = Σ_{i} Σ_{j} (x_{ij} - x̄)²

where x_{ij} represents the jth observation within the ith group, and x̄ is the mean of all observed data values. Finally, the relationship between SS Total, SS Groups, and SS Error is
SS Total = SS Groups + SS Error
Overall, the relationship between the total variation, the variation between groups, and the variation within a group is illustrated by Figure 2.
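The partition of the total variation can be verified numerically. A short NumPy sketch, using the Table 1 values under the grouping assumed earlier:

```python
# Check that SS Total = SS Groups + SS Error on the Table 1 data
# (grouping assumed: {7,9,5,7}, {9,11,7}, {15,12,18}).
import numpy as np

groups = [np.array([7, 9, 5, 7]), np.array([9, 11, 7]), np.array([15, 12, 18])]
all_data = np.concatenate(groups)
grand_mean = all_data.mean()

ss_groups = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_total = ((all_data - grand_mean) ** 2).sum()

print(ss_groups, ss_error, ss_total)   # 114.0 34.0 148.0
```

The identity holds exactly: 114 + 34 = 148.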
A general table for performing the one-way ANOVA calculations required to compute the F-statistic is given below.
Table 2 - One-Way ANOVA Table

Source                | Degrees of Freedom | Sum of Squares | Mean Sum of Squares           | F-Statistic
Between groups        | k - 1              | SS Groups      | MS Groups = SS Groups/(k - 1) | F = MS Groups/MSE
Within groups (error) | N - k              | SS Error       | MSE = SS Error/(N - k)        |
Total                 | N - 1              | SS Total       |                               |
12.4.5 Interpreting the F-statistic
Once the F-statistic has been found, it can be compared with a critical F value from a table, such as this one: F Table. This F table is calculated for a value of alpha = 0.05, indicating a 95% confidence level. This means that if F_{observed} is larger than F_{critical} from the table, then we can reject the null hypothesis and say with 95% confidence that the variance between groups is not due to random chance, but rather due to the influence of a tested factor. Tables are also available for other values of alpha and can be used to find a more exact probability that the difference between groups is (or is not) caused by random chance.
12.4.6 Finding the Critical F value
In this F table, the first row gives the degrees of freedom between groups (number of groups - 1), and the first column gives the degrees of freedom within groups (total number of samples - number of groups).
For the diet example in Table 1, the degrees of freedom between groups is (3 - 1) = 2 and the degrees of freedom within groups is (13 - 3) = 10. Thus, the critical F value is 4.10.
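The table lookup can be reproduced with SciPy's F-distribution quantile function:

```python
# Critical F value for the diet example: alpha = 0.05,
# df between = 2, df within = 10.
from scipy import stats

f_critical = stats.f.ppf(1 - 0.05, dfn=2, dfd=10)
print(round(f_critical, 2))  # 4.1
```

The same distribution object gives p-values directly via `stats.f.sf(f_observed, 2, 10)`.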
12.4.7 Computing the 95% Confidence Interval for the Population Means
It is useful to know the confidence interval at which the means of the different groups are reported. The general formula for calculating a confidence interval is sample mean ± multiplier × standard error. Because all populations are assumed to have the same standard deviation, the pooled standard deviation s_{p} can be used to estimate the standard deviation within each group. Although the population standard deviation is assumed to be the same, the standard error and the multiplier may be different for each group, due to differences in group size and degrees of freedom. The standard error of a sample mean is inversely proportional to the square root of the number of data points within the sample; it is calculated as s_{p}/√n_{i}. The multiplier t* is determined using a t-distribution with df = N - k degrees of freedom. Therefore, the confidence interval for a population mean is x̄_{i} ± t* · s_{p}/√n_{i}. More details on confidence intervals can be found in Comparison of two means.
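A sketch of this confidence-interval calculation in Python, again assuming the Table 1 grouping used earlier:

```python
# 95% confidence interval for the Program 1 mean, using the pooled
# standard deviation (Table 1 grouping assumed as before).
import numpy as np
from scipy import stats

groups = [np.array([7, 9, 5, 7]), np.array([9, 11, 7]), np.array([15, 12, 18])]
N = sum(len(g) for g in groups)
k = len(groups)

sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
s_p = np.sqrt(sse / (N - k))             # pooled standard deviation
t_star = stats.t.ppf(0.975, df=N - k)    # multiplier for 95% confidence

g1 = groups[0]
half_width = t_star * s_p / np.sqrt(len(g1))
print(f"Program 1 mean: {g1.mean():.2f} +/- {half_width:.2f}")
```

Groups with more data points get narrower intervals, since the standard error shrinks as √n_{i} grows.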
An example of using the F-test is the following:
You have two assembly lines. Suppose you sample 10 parts from each of the two assembly lines.
H_{o}: s_{1}² = s_{2}²
H_{a}: the variances are not equal
Are the two lines producing similar outputs? Assume α = 0.05.
F_{.025,9,9} = 4.03
F_{1-.025,9,9} = ?
Are variances different?
How would we test if the means are different?
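A sketch of this variance-ratio F-test in Python; the sample variances are hypothetical, since the example leaves them unspecified:

```python
# Two-sided F-test for equal variances: n1 = n2 = 10, so both
# degrees of freedom are 9. Sample variances here are hypothetical.
from scipy import stats

alpha = 0.05
upper = stats.f.ppf(1 - alpha / 2, dfn=9, dfd=9)   # F_{.025,9,9}
lower = stats.f.ppf(alpha / 2, dfn=9, dfd=9)       # F_{1-.025,9,9} = 1/upper
print(round(upper, 2), round(lower, 3))            # 4.03 0.248

s1_sq, s2_sq = 2.5, 0.9      # hypothetical sample variances
f_obs = s1_sq / s2_sq
# Reject H0 (equal variances) only if f_obs falls outside [lower, upper].
print(lower < f_obs < upper)                       # True
```

To test whether the *means* differ, one would instead use a two-sample t-test (or one-way ANOVA, which is equivalent for two groups).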
12.5 Two-Factor Analysis of Variance
A two-factor or two-way analysis of variance is used to examine how two qualitative categorical variables (e.g., male/female) affect the mean of a quantitative response variable. For example, a psychologist might want to study how the type and volume of background music affect worker productivity. Alternatively, an economist may be interested in determining the effect of gender and race on mean income. In both of these examples, there is interest in the effect of each separate explanatory factor, as well as the combined effect of both factors.
12.5.1 Assumptions
In order to use the twoway ANOVA, the following assumptions are required:
 Samples must be independent.
 Population variances must be equal.
 Groups must have the same sample size.
 The populations from which the samples were obtained must be normally distributed (or at least approximately so).
 The null hypothesis is assumed to be true.
The null hypothesis is as follows:
 The population means for the first factor have to be equal. This is similar to the one-way ANOVA for the row factor.
 The population means for the second factor must also be equal. This is similar to the one-way ANOVA for the column factor.
 There is no interaction between the two factors. This is similar to performing an independence test using contingency tables.
More simply, the null hypothesis implies that the populations are all similar and any differences in the populations are caused by chance, not by the influence of a factor. After carrying out twoway ANOVA it will be possible to analyze the validity of this assumption.
12.5.2 Terms Used in Two-Way ANOVA
The interaction between two factors is the most unique part of a two-way analysis of variance problem. When two factors interact, the effect of one factor on the response variable depends on the value of the other factor. For example, the statement "being overweight caused greater increases in blood pressure for men than for women" describes an interaction. In other words, the effect of weight (a factor) on blood pressure (the response) depends on gender (another factor).
The term main effect is used to describe the overall effect of a single explanatory variable. In the music example, the main effect of the factor "music volume" is the effect on productivity averaged over all types of music. Clearly, the main effect may not always be useful if the interaction is unknown.
In a two-way analysis of variance, three F-statistics are constructed. One is used to test the statistical significance of the interaction, while the other two are used to test the significance of the two separate main effects. The p-value for each F-statistic is also reported; a p-value of < 0.05 is usually used to indicate significance. When a factor's F-statistic is found to be statistically significant, that factor has a significant main effect. The p-value is also used as an indicator to determine if the two factors have a significant interaction when considered simultaneously: if the response to one factor depends strongly on the level of the other, the F-statistic for the interaction term will have a low p-value. An example output of a two-way analysis of variance of restaurant tip data is given in Table 4.
Table 4 - Two-Way Analysis of Variance of Restaurant Tipping Data

Source      | DF | Adj SS  | Adj MS | F-Statistic | P-Value
Message     | 1  | 14.7    | 14.7   | 0.13        | 0.715
Sex         | 1  | 2602.0  | 2602.0 | 23.69       | 0.000
Interaction | 1  | 438.7   | 438.7  | 3.99        | 0.049
Error       | 85 | 9335.5  | 109.8  |             |
Total       | 88 | 12407.9 |        |             |
In this case, the factors being studied are sex (male or female) and message on the receipt (a ':)' or none). The p-values in the last column are the most important information contained in this table; a lower p-value indicates a higher level of significance. Message has a p-value of 0.715, much greater than the 0.05 significance level, indicating that this factor on its own has no significant effect (no strong correlation between the presence of a message and the amount of the tip). The reason this occurs is that there is a relationship between the message and the sex of the server: the interaction term, which was significant with p = 0.049, showed that drawing a happy face increased the tip for women but decreased it for men. The main effect of server sex (with a p-value of approximately 0) shows that there is a statistical difference in average tips for men and women.
12.5.3 Two-Way ANOVA Calculations
As in one-way ANOVA, the main tool used is the sum of squares of each group. Two-way ANOVA can be split into two different types: with repetition and without repetition. With repetition means that every case is repeated a set number of times. For the above example, that would mean the ':)' message was given to females 10 times and males 10 times, and no message was given to females 10 times and males 10 times.
Using the SS values as a start, the F-statistics for two-way ANOVA with repetition are calculated using the chart below, where a is the number of levels of main effect A, b is the number of levels of main effect B, and n is the number of repetitions.
Source             | SS              | DF             | Adj MS | F-Statistic
Main Effect A      | From data given | a - 1          | SS/df  | MS(A)/MS(W)
Main Effect B      | From data given | b - 1          | SS/df  | MS(B)/MS(W)
Interaction Effect | From data given | (a - 1)(b - 1) | SS/df  | MS(A*B)/MS(W)
Within             | From data given | ab(n - 1)      | SS/df  |
Total              | Sum of others   | abn - 1        |        |
Without repetition means there is one reading for every case. For example, if you were investigating whether differences in yield are more significantly affected by the day the readings were taken or by the reactor the readings were taken from, you would have one reading for Reactor 1 on Monday, one reading for Reactor 2 on Monday, and so on. The results for two-way ANOVA without repetition are slightly different in that there is no interaction effect measured, and the "Within" row is replaced with a similar (but not equal) "Error" row. The calculations needed are shown in the table below.
Source        | SS              | DF             | MS    | F-Statistic
Main Effect A | From data given | a - 1          | SS/df | MS(A)/MS(E)
Main Effect B | From data given | b - 1          | SS/df | MS(B)/MS(E)
Error         | From data given | (a - 1)(b - 1) | SS/df |
Total         | Sum of others   | ab - 1         |       |
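The without-repetition calculations can be sketched by hand with NumPy; the reactor/day yields below are invented for illustration:

```python
# Two-way ANOVA without replication, by hand.
# Rows = reactors (factor A), columns = days (factor B); yields invented.
import numpy as np

data = np.array([[88.0, 90.1, 89.5, 91.2],
                 [91.3, 92.8, 92.0, 93.5],
                 [87.5, 89.0, 88.2, 90.0]])
a, b = data.shape
grand = data.mean()

ss_a = b * ((data.mean(axis=1) - grand) ** 2).sum()      # main effect A
ss_b = a * ((data.mean(axis=0) - grand) ** 2).sum()      # main effect B
ss_total = ((data - grand) ** 2).sum()
ss_error = ss_total - ss_a - ss_b

ms_a = ss_a / (a - 1)
ms_b = ss_b / (b - 1)
ms_error = ss_error / ((a - 1) * (b - 1))

print(f"F_A = {ms_a / ms_error:.2f}, F_B = {ms_b / ms_error:.2f}")
```

Each F-statistic is then compared against the critical F value with the corresponding degrees of freedom, exactly as in the one-way case.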
These calculations are almost never done by hand. In this class you will usually use Excel or Mathematica to create these tables. Sections describing how to use these programs are found later in this chapter.
12.6 Other Methods of Comparison
Unfortunately, the conditions for using the ANOVA F-test do not hold in all situations. In this section, several other methods are presented that do not rely on equal population standard deviations or normally distributed data. It is important to realize that no method of factor analysis is appropriate if the data given are not representative of the group being studied.
12.6.1 Hypotheses About Medians
In general, it is best to construct hypotheses about a population median, rather than the mean, when samples may be skewed by extreme outliers. Median hypotheses should also be used when dealing with ordinal variables (variables that can only be ranked as higher or lower than one another and do not have a precise value). When several populations are compared, the hypotheses are stated as:
H_{0}: Population medians are equal
H_{a}: Population medians are not all equal
12.6.2 Kruskal-Wallis Test for Comparing Medians
The Kruskal-Wallis test provides a method of comparing medians by comparing the relative rankings of data in the observed samples. This test is therefore referred to as a rank test or nonparametric test, because it makes no assumptions about the distribution of the data.
To conduct this test, the values in the total data set are first ranked from lowest to highest, with 1 being lowest and N being highest. The ranks of the values within each group are averaged, and the test statistic measures the variation among the average ranks for each group. A p-value can be determined by finding the probability that the variation among the set of rank averages for the groups would be as large as or larger than observed if the null hypothesis were true. More information on the Kruskal-Wallis test can be found [here].
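A minimal sketch with `scipy.stats.kruskal`, applied to the Table 1 weight-loss data under the grouping assumed earlier:

```python
# Kruskal-Wallis rank test on the Table 1 data (grouping assumed as before);
# no normality assumption is required.
from scipy import stats

program1 = [7, 9, 5, 7]
program2 = [9, 11, 7]
program3 = [15, 12, 18]

h_stat, p_value = stats.kruskal(program1, program2, program3)
print(f"H = {h_stat:.2f}, p = {p_value:.3f}")
```

SciPy applies a correction for tied values (the repeated 7s and 9s here) automatically.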
12.6.3 Mood's Median Test for Comparing Medians
Another nonparametric test used to compare population medians is Mood's Median Test. Also called the Sign Scores Test, this test involves multiple steps.
1. Calculate the median (M) using all data points from every group in the study
2. Create a contingency table as follows:

                                         | A | B | C | Total
Number of values greater than M          |   |   |   |
Number of values less than or equal to M |   |   |   |
3. Calculate the expected value for each cell of the contingency table using the following formula:

expected value = (row total × column total) / grand total

4. Calculate the chi-square statistic using the following formula:

χ² = Σ (observed - expected)² / expected

A chi-square statistic for two-way tables is used to test the null hypothesis that the population medians are all the same. The test is equivalent to testing whether the two variables are related.
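SciPy implements Mood's median test directly (`scipy.stats.median_test`; the result-object interface of SciPy >= 1.7 is assumed). With samples this small the chi-square approximation is rough, so this is only a sketch of the mechanics:

```python
# Mood's median test on the Table 1 data (grouping assumed as before).
# With samples this small the chi-square approximation is rough.
from scipy import stats

program1 = [7, 9, 5, 7]
program2 = [9, 11, 7]
program3 = [15, 12, 18]

res = stats.median_test(program1, program2, program3)
print(f"grand median = {res.median}, p = {res.pvalue:.3f}")
# res.table holds the counts above / at-or-below the grand median per group.
```

By default, values equal to the grand median are counted in the "less than or equal" row, matching the contingency table layout above.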
12.7 ANOVA and Factor Analysis in Process Control
ANOVA and factor analysis are typically used in process control for troubleshooting purposes. When a problem arises in a process control system, these techniques can be used to help solve it. A factor can be defined as a single variable or simple process that has an effect on the system. For example, a factor can be the temperature of an inlet stream, the flow rate of coolant, or the position of a specific valve. Each factor can be analyzed individually to determine the effect that changing the input has on the process control system as a whole. The input variable can have a large, small, or no effect on what is being analyzed. The amount that the input variable affects the system is called the "factor loading", a numerical measure of how much a specific variable influences the system or the output variable. In general, the larger the factor loading for a variable, the greater the effect it has on the output variable.
A simple equation for this would be:
Output = f_{1} * input_{1} + f_{2} * input_{2} + ... + f_{n} * input_{n}
Where f_{n} is the factor loading for the n^{th} input.
Factor analysis is used in this case study to determine the fouling in an alcohol plant reboiler. This article provides some additional insight as to how factor analysis is used in an industrial situation.
12.8 Using Mathematica to Conduct ANOVA
Mathematica can be used for one-way and two-way factor analyses. Before this can be done, the ANOVA package must be loaded into Mathematica using the following command:
Needs["ANOVA`"]
Once this command is executed, the 'ANOVA' command can be utilized.
12.8.1 OneWay Factor Analysis
The basic form of the 'ANOVA' command to perform a oneway factor analysis is as follows:
ANOVA[data]
An example set of data with five elements would look like:
ANOVA[{{1, 7}, {1, 9}, {2, 9}, {2, 11}, {3, 15}}]
(* each pair is {factor level, response value}; the values here are illustrative *)
An output table that includes the degrees of freedom, sum of squares, mean sum of squares, F-statistic, and p-value for the model, error, and total will be displayed when this line is executed. A list of cell means for each model will be displayed beneath the table.
12.8.2 TwoWay Factor Analysis
The basic form of the 'ANOVA' command to perform a twoway factor analysis is as follows:
ANOVA[data, model, vars]
An example set of data with seven elements would look like:
ANOVA[{{1, 1, 10}, {1, 2, 11}, {1, 3, 12}, {2, 1, 13}, {2, 2, 14}, {2, 3, 15}, {3, 1, 16}}, {x, y}, {x, y}]
(* each triple is {level of x, level of y, response value}; the values here are illustrative *)
An output table will appear similar to the one that is displayed in the oneway analysis except that there will be a row of statistics for each variable (i.e. x,y).
12.9 ANOVA in Microsoft Excel 2007
In order to access the ANOVA data analysis tool, install the package:
1. Click on the Microsoft Office button (big circle with office logo)
2. Click 'Excel Options'
3. Click 'Add-Ins' on the left side
4. In the 'Manage' drop-down box at the bottom of the window, select 'Excel Add-ins'
5. Click 'Go...'
6. In the Add-Ins window, check the 'Analysis ToolPak' box and click 'OK'
To use this package:
1. Click on the 'Data' tab and select 'Data Analysis'
2. Choose the desired ANOVA type: 'Anova: Single Factor', 'Anova: Two-Factor With Replication', or 'Anova: Two-Factor Without Replication' (see note below for when to use replication)
3. Select the desired data points including data labels at top of the corresponding columns. Make sure the box is checked for 'Labels in first row' in the ANOVA parameter window.
4. Specify alpha in the ANOVA parameter window. Alpha represents the level of significance.
5. Output the results into a new worksheet.
NOTE: 'Anova: Two-Factor With Replication' is used in cases where there are multiple readings for each factor combination. For instance, in the input below there are 2 factors: control architecture and unit. There are 3 readings corresponding to each control architecture (FB, MPC, and cascade); in this sense, each control architecture is replicated 3 times, each time providing different data relating to each unit. So, in this case, you would want to use the 'Anova: Two-Factor With Replication' option.
Anova: Two Factor without Replication is used in cases where there is only one reading pertaining to a particular factor. For example, in the case below, each sample (row) is independent of the other samples since they are based on the day they were taken. Since multiple readings were not taken within the same day, the "without Replication" option should be chosen.
Excel outputs:
Summary:
1. Count - the number of data points in a set
2. Sum - the sum of the data points in a set
3. Average - the mean of the data points in a set
4. Variance - the variance of the data points in a set
ANOVA:
1. Sum of squares (SS)
2. The degree of freedom (df)
3. The mean squares (MS)
4. Fstatistic (F)
5. Pvalue
6. F_{critical}
See the figure below for an example of the inputs and outputs using Anova: Single Factor. Note the location of the Data Analysis tab. The data was obtained from the dieting programs described in Table 1. Since the Fstatistic is greater than F_{critical}, the null hypothesis can be rejected at a 95% confidence level (since alpha was set at 0.05). Thus, weight loss was not random and in fact depends on diet type chosen.
12.10 Worked out Example 1
Determine the fouling rate of the reboiler at the following parameters:
T = 410K
C_{c} = 16.7g / L
R_{T} = 145min
Which process variable has the greatest effect (per unit) on the fouling rate of the reboiler?
Note that the tables below are made-up data. The output data for a single input were gathered assuming that the other input variables have a negligible effect. Although the factors that affect the fouling of the reboiler are similar to the ones found in the article linked in the "ANOVA and Factor Analysis in Process Control" section, the data are not.
Temperature of Reboiler (K) | 400 | 450  | 500
Fouling Rate (mg/min)       | 0.8 | 0.86 | 0.95

Catalyst Concentration (g/L) | 10  | 20   | 30
Fouling Rate (mg/min)        | 0.5 | 1.37 | 2.11

Residence Time (min)  | 60   | 120 | 180
Fouling Rate (mg/min) | 0.95 | 2.3 | 3.81
Solution:
1) Determine the "factor loading" for each variable.
This can be done using any linearization tool. In this case, the factor loading is just the slope of the line for each set of data. Using Microsoft Excel, the equations for each set of data are the following:
Temperature of Reboiler
y = 0.0015 * x + 0.195
Factor loading: 0.0015
Catalyst Concentration
y = 0.0805 * x − 0.2833
Factor loading: 0.0805
Residence Time
y = 0.0238 * x − 0.5067
Factor loading: 0.0238
2) Determine the fouling rate for the given process conditions and which process variable affects the fouling rate the most (per unit). Note that the units of the factor loading value are always the units of the output divided by the units of the input.
Plug in the factor loading values into the following equation:
Output = f_{1} * input_{1} + f_{2} * input_{2} + ... + f_{n} * input_{n}
You will end up with:
FoulingRate = 0.0015 * T + 0.0805 * C_{c} + 0.0238 * R_{T}
Now plug in the process variables:
FoulingRate = 0.0015 * 410 + 0.0805 * 16.7 + 0.0238 * 145
FoulingRate = 5.41mg / min
The process variable that affects the fouling rate the most (per unit) is the catalyst concentration because it has the largest factor loading value.
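The worked example can be reproduced in Python: `numpy.polyfit` recovers the factor loadings (the fitted slopes), and a dot product evaluates the fouling rate. The unrounded slopes give 5.42 mg/min versus the text's 5.41, a rounding difference:

```python
# Reproducing Worked Example 1: the factor loadings are the fitted slopes,
# and the fouling rate is the loading-weighted sum of the inputs.
import numpy as np

temperature = ([400, 450, 500], [0.8, 0.86, 0.95])
catalyst = ([10, 20, 30], [0.5, 1.37, 2.11])
residence = ([60, 120, 180], [0.95, 2.3, 3.81])

loadings = [float(np.polyfit(x, y, 1)[0])
            for x, y in (temperature, catalyst, residence)]
print([round(f, 4) for f in loadings])   # [0.0015, 0.0805, 0.0238]

inputs = [410, 16.7, 145]                # T (K), Cc (g/L), RT (min)
fouling_rate = float(np.dot(loadings, inputs))
print(round(fouling_rate, 2))            # 5.42 (text's 5.41 used rounded slopes)
```

The catalyst concentration again stands out as the largest per-unit loading.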
12.11 Worked out Example 2
Problem:
The exit flow rate leaving a tank is being tested for 3 cases. The first case is under the normal operating conditions, while the second (A) and the third (B) cases are for new conditions that are being tested. The flow value of 7 (gallons /hour) is desired with a maximum of 10. A total of 24 runs are tested with 8 runs for each case. The tests are run to determine whether any of the new conditions will result in a more accurate flow rate. First, we determine if the new conditions A and B affect the flow rate. The results are as follows:
The recorded values for the 3 cases are tabulated. Following this, the values for each case are squared, and the sums of all of these are taken. For the 3 cases, the sums are squared and then their means are found.
These values are used to help determine the table above (the equations give an idea as to how they are calculated). With the help of ANOVA, these values can be determined much faster; this can be done using the Mathematica approach explained above.
Conclusion:
F_{critical} equals 3.4668, from an F-table. Since the calculated F value is greater than F_{critical}, we know that there is a statistically significant difference between two of the conditions, so the null hypothesis can be rejected. However, we do not know between which two conditions there is a difference; a post-hoc analysis is needed to determine this. We are, however, able to confirm that a difference exists.
12.12 Worked out Example 3
As the new engineer on site, one of your assigned tasks is to install a new control architecture for three different units. You test the three units in triplicate, each with 3 different control architectures: feedback (FB), model predictive control (MPC), and cascade control. In each case you measure the yield and organize the data as follows:
Do the units differ significantly? Do the control architectures differ significantly?
Answer: This problem can be solved using an 'Anova: Two-Factor With Replication' analysis.
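A hand computation of the two-factor-with-replication table in NumPy; the 3×3×3 yield values are invented, since the problem's data table did not survive extraction:

```python
# Two-factor ANOVA with replication, by hand.
# data[unit, architecture, replicate]; yields invented, architectures
# ordered FB, MPC, cascade.
import numpy as np

data = np.array([[[90.1, 89.8, 90.4], [92.0, 91.7, 92.3], [91.0, 90.8, 91.2]],
                 [[88.5, 88.9, 88.2], [90.5, 90.2, 90.8], [89.4, 89.1, 89.6]],
                 [[91.2, 91.5, 90.9], [93.1, 92.8, 93.4], [92.0, 92.3, 91.8]]])
a, b, n = data.shape
grand = data.mean()

ss_a = b * n * ((data.mean(axis=(1, 2)) - grand) ** 2).sum()   # units
ss_b = a * n * ((data.mean(axis=(0, 2)) - grand) ** 2).sum()   # architectures
cell_means = data.mean(axis=2)
ss_within = ((data - cell_means[:, :, None]) ** 2).sum()
ss_total = ((data - grand) ** 2).sum()
ss_interaction = ss_total - ss_a - ss_b - ss_within

ms_within = ss_within / (a * b * (n - 1))
f_units = (ss_a / (a - 1)) / ms_within
f_arch = (ss_b / (b - 1)) / ms_within
f_inter = (ss_interaction / ((a - 1) * (b - 1))) / ms_within
print(f"F_units = {f_units:.1f}, F_arch = {f_arch:.1f}, F_inter = {f_inter:.2f}")
```

Each F is compared against the critical value for its degrees of freedom; a significant interaction term would mean the best architecture differs from unit to unit.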
12.13 Multiple Choice Question 1
ANOVA analysis works best for which type of model?
A. Nonlinear models
B. Linear models
C. Exponential models
D. All of the above
12.14 Multiple Choice Question 2
Two-Way ANOVA analysis is used to compare what?
A. Any two sets of data
B. Two OneWay ANOVA models to each other
C. Two factors on their effect of the output
D. B and C
12.15 Multiple Choice Answers
Question 1: B
Question 2: C
12.17 References
 Ogunnaike, Babatunde and W. Harmon Ray. Process Dynamics, Modeling, and Control. Oxford University Press. New York, NY: 1994.
 Utts, J. and Heckard, R. Mind on Statistics. Chapter 16 - Analysis of Variance. Belmont, CA: Brooks/Cole - Thomson Learning, Inc. 2004.
 Charles Spearman. Retrieved November 1, 2007, from www.indiana.edu/~intell/spearman.shtml
 Plonsky, M. "One Way ANOVA." Retrieved November 13, 2007, from www.uwsp.edu/psych/stat/12/anova1w.htm
 Ender, Phil. "Statistical Tables F Distribution." Retrieved November 13, 2007, from www.gseis.ucla.edu/courses/help/dist3.html
 Devore, Jay L. Probability and Statistics for Engineering and the Sciences. Chapter 10 - The Analysis of Variance. Belmont, CA: Brooks/Cole - Thomson Learning, Inc. 2004.
 "Mood's Median Test (Sign Scores Test)" Retrieved November 29, 2008, from www.micquality.com/six_sigma_glossary/mood_median_test.htm