You often have to compare data sets. For example, you may need to evaluate whether two teaching methods produce similar results, or whether two customer segments have the same income levels. If you are working with two sets of data, you can test the difference by performing a two-sample hypothesis test. However, a two-sample hypothesis test cannot be used if you need to evaluate three or more sets of data.
This is where ANOVA – the analysis of variance test – comes in. The term ANOVA is an acronym: ANalysis Of VAriance! ANOVA allows you to test whether the means of three or more groups are equal.
MBA and CFA students will encounter ANOVA in statistics or other data analysis courses. We provide ANOVA tutoring on the different types of analysis of variance you will encounter in any business program. Do email or call us if you would benefit from tutoring on ANOVA.
One-Way ANOVA, or Completely Randomized Design
ANOVA can be done on one factor at a time across different groups – called one-way ANOVA or completely randomized design. For example, we may look at a pizza's taste rating when it is made in different types of ovens. Here we are looking at one factor – oven type – and comparing taste ratings across its levels (groups). Or we may look at the defect rates of a product made on different machines. Here again we are looking at one factor – machine – and comparing defect rates across its levels. This kind of analysis of variance, where only one factor is evaluated, is called completely randomized design or One-Way ANOVA. One-Way ANOVA or completely randomized design is used to test the:
- Statistical differences among the means of two or more groups
- Statistical differences among the means of two or more interventions
- Statistical differences among the means of two or more change scores
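The one-way case above can be sketched in a few lines with SciPy's `f_oneway`. The pizza taste ratings and oven types below are invented purely for illustration:

```python
# One-way ANOVA on hypothetical pizza taste ratings from three oven types.
from scipy.stats import f_oneway

deck_oven = [7.2, 6.8, 7.5, 7.0, 6.9, 7.3]
conveyor_oven = [6.1, 6.4, 5.9, 6.3, 6.0, 6.2]
wood_fired = [8.1, 7.9, 8.4, 8.0, 8.2, 7.8]

# One factor (oven type), three levels; the response is the taste rating.
f_stat, p_value = f_oneway(deck_oven, conveyor_oven, wood_fired)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A small p-value here would suggest that at least one oven type's mean rating differs from the others.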
Often you will need to look at multiple factors at the same time. For example, you may want to evaluate not only the teaching technique but also the class size when comparing grades. Or you may want to consider both the machine and the operator shift when evaluating defect rates. Evaluating one factor at a time would be inefficient here, so you consider multiple factors simultaneously. ANOVA done on multiple factors is called factorial design.
F Test and F-distribution
You encountered the z-distribution and the t-distribution when setting up confidence intervals and hypothesis tests. When doing an analysis of variance, you will use the F-distribution. The F-distribution is named after Ronald Fisher and was tabulated by George W. Snedecor; it is also called Snedecor's F distribution or the Fisher–Snedecor distribution. The F-distribution is a continuous probability distribution that takes only non-negative values and is skewed to the right, with a long tail on one side.
Image: F distribution
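As a quick sketch, SciPy exposes this distribution as `scipy.stats.f`; the degrees of freedom below (3 and 20) are arbitrary illustrative choices:

```python
# The F-distribution: non-negative support and a long right tail.
from scipy.stats import f

dfn, dfd = 3, 20  # numerator and denominator degrees of freedom (illustrative)

# The density is high near the low end and decays slowly to the right.
for x in [0.5, 1.0, 2.0, 5.0]:
    print(f"pdf({x}) = {f.pdf(x, dfn, dfd):.4f}")

# Cumulative probability up to 1.0 under these degrees of freedom.
print(f"P(F <= 1) = {f.cdf(1.0, dfn, dfd):.3f}")
```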
One-Way ANOVA Assumptions
The One-Way ANOVA test is appropriate only if specific assumptions are met. If one or more of these assumptions are not satisfied, the ANOVA results are less reliable.
- The data is not nominal or ordinal. If the data is ordinal, a non-parametric test should be used to analyze variance. The Kruskal–Wallis one-way analysis of variance is an example of a non-parametric test.
- The data is continuous (interval or ratio) data. Remember that the F-distribution is a continuous probability distribution.
- The variances of the groups should be equal (or close to equal). If the group variances are not equal, a One-Way ANOVA will not be appropriate; Welch's ANOVA (or, for two groups, Welch's t-test) can be used instead.
- Each of the groups being analyzed should have 6 or more data points – a common rule of thumb for sufficient representation.
- The number of data points in each group should ideally be equal, or at least approximately equal (a balanced design).
If any of the above assumptions is not met, the results of a One-Way ANOVA test may not be reliable.
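One way to check the equal-variance assumption before running a One-Way ANOVA is Levene's test. A minimal sketch with made-up data, using SciPy:

```python
# Checking the equal-variance assumption with Levene's test (scipy).
from scipy.stats import levene

# Hypothetical measurements from three groups with similar spreads.
group_a = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8]
group_b = [5.0, 5.2, 4.9, 5.1, 4.8, 5.3]
group_c = [6.1, 5.9, 6.2, 6.0, 5.8, 6.3]

stat, p = levene(group_a, group_b, group_c)
# A large p-value means no evidence of unequal variances, so the
# equal-variance assumption for One-Way ANOVA looks reasonable here.
print(f"Levene statistic = {stat:.3f}, p = {p:.3f}")
```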
Process involved in conducting an Analysis of Variance (ANOVA testing)
The process involved in an ANOVA or analysis of variance is similar to what we do in a hypothesis test.
- You first set up a null hypothesis and an alternate hypothesis.
- You then choose a significance level and find the corresponding critical value, which you use as a cut-off point.
- You gather and analyze the data from samples.
- You then compute the F-statistic from the sample data.
- Compare the F-statistic with the critical value which you set in step 2.
- You will then decide whether or not you can reject the null hypothesis.
- You will finally make a conclusion.
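The steps above can be sketched end to end in Python. The groups below are invented, and a 5% significance level is assumed:

```python
# The ANOVA process, end to end, with illustrative data.
from scipy.stats import f, f_oneway

# Step 1: H0: all group means are equal; H1: at least one differs.
groups = [
    [23, 25, 21, 24, 22, 26],   # e.g. teaching method A (made-up scores)
    [28, 27, 30, 29, 26, 31],   # method B
    [22, 24, 23, 21, 25, 22],   # method C
]

# Step 2: critical value at a 5% significance level.
alpha = 0.05
c = len(groups)                   # number of groups
n = sum(len(g) for g in groups)   # total number of observations
critical_f = f.ppf(1 - alpha, c - 1, n - c)

# Steps 3-4: gather the data and compute the F statistic from it.
f_stat, p_value = f_oneway(*groups)

# Steps 5-7: compare against the critical value, decide, and conclude.
if f_stat > critical_f:
    print(f"F = {f_stat:.2f} > {critical_f:.2f}: reject H0")
else:
    print(f"F = {f_stat:.2f} <= {critical_f:.2f}: fail to reject H0")
```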
One-Way ANOVA: Null Hypothesis & Alternate Hypothesis
The null hypothesis in a One-Way ANOVA is simply that the means of all the groups are equal: H₀: μ1 = μ2 = … = μc, where c is the number of groups.
Image The null hypothesis and an alternate hypothesis One-Way ANOVA
The alternate hypothesis in a One-Way ANOVA is that not all of the group means are equal – at least one mean differs.
That is it. The null hypothesis and the alternate hypothesis in a One-Way ANOVA or completely randomized design are fixed, unlike in a general hypothesis test where you have to design them for the specific situation. Here you are only checking whether the means of the different groups are equal.
The Critical Value of F
You need to decide on a critical value of F that you will use as a cut-off point. The critical value of F is based on the probability you use as a cut-off – also called the significance level. This is the probability of error you are willing to accept, and it will depend on the data you are working with. Generally, a 5% significance level is used.
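Instead of looking the critical value up in an F table, you can use SciPy's inverse CDF. The degrees of freedom below (2 and 12, i.e. 3 groups and 15 observations) are illustrative:

```python
# Finding the critical value of F at a 5% significance level.
from scipy.stats import f

alpha = 0.05
dfn = 2    # numerator df: c - 1 (3 groups in this illustration)
dfd = 12   # denominator df: n - c (15 observations, 3 groups)

critical_f = f.ppf(1 - alpha, dfn, dfd)  # inverse CDF at 95%
print(f"Critical F({dfn}, {dfd}) at alpha = {alpha}: {critical_f:.3f}")
```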
The Computed F Test Statistic
We gather samples from the different groups we are trying to evaluate. We use the sample data to compute the sample F test statistic in order to compare it with the critical value of F. There are a number of ingredients required to compute the F test statistic. Today many software tools will compute the F statistic for you, but it is good to understand what goes into computing it.
The Sums of Squares Total: SST for the One-Way ANOVA
The sum of squares total is referred to as SST. The SST for a one-way ANOVA is the sum of the squared deviations of each value from the overall, or grand, average. You subtract the grand mean from each value in every group and square the result. Squaring removes the negative signs that arise when a sample value is less than the grand average.
The Sums of Squares Among Groups: SSA for the One-Way ANOVA
The sum of squares among groups is referred to as SSA. The SSA for the one-way ANOVA measures the squared deviation of each group's average from the grand average. Here you subtract the grand mean from each group's average (not from every value), square it, and multiply by the number of data points in that group; you then add these up across the groups.
The Sums of Squares Within Groups: SSW for the One-Way ANOVA
The sum of squares within groups is referred to as SSW. The SSW for the one-way ANOVA is simply the sum of the squared deviations of each value in a group from the average of that group. Here you subtract the group's own mean from each value in that group, square the deviation, and add it up across all the groups.
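The three sums of squares can be computed by hand in a few lines of plain Python. The data below is made up; the block also illustrates the identity SST = SSA + SSW:

```python
# Computing SST, SSA and SSW by hand for a one-way ANOVA (made-up data).
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [ 9.0, 10.0,  8.0, 11.0],
]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)

# SST: squared deviation of every value from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)

# SSA: squared deviation of each group mean from the grand mean,
# weighted by that group's sample size.
ssa = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups)

# SSW: squared deviation of each value from its own group's mean.
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(f"SST = {sst:.2f}  SSA = {ssa:.2f}  SSW = {ssw:.2f}")
```

Note that the total variation always splits exactly into the among-groups and within-groups pieces.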
Degrees of Freedom for one-way ANOVA
Two quantities determine the degrees of freedom for a one-way ANOVA.
- The number of groups (levels) c; and
- The total sample size n.
The degrees of freedom vary based on the sum of squares you are computing: n − 1 for SST, c − 1 for SSA, and n − c for SSW. We will use these when computing the mean squares next.
Mean Squares for One-Way ANOVA: MST, MSA & MSW
Mean Squares Total (MST)
The Mean Squares of the Total is referred to as MST. You take the SST arrived at above and divide it by the degrees of freedom associated with the SST which is n-1.
Image of F table terms
Mean Squares Among (MSA)
The Mean Squares of the among-groups total is referred to as MSA. You take the SSA arrived at above and divide it by the degrees of freedom associated with the SSA, which is c − 1 (where c is the number of groups).
Mean Squares Within (MSW)
The Mean Squares of the within-groups total is referred to as MSW. You take the SSW arrived at above and divide it by the degrees of freedom associated with the SSW, which is n − c.
Once you have the MSA and MSW, you are ready to compute the F-test statistic. The F-test statistic is simply the MSA/MSW.
image F-test statistic
The F-test statistic follows an F distribution with c − 1 numerator degrees of freedom and n − c denominator degrees of freedom.
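Putting the pieces together, here is a minimal sketch (with made-up data) that goes from the sums of squares to MSA, MSW and the F statistic:

```python
# From sums of squares to the F statistic: F = MSA / MSW (made-up data).
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [ 9.0, 10.0,  8.0, 11.0],
]
c = len(groups)                     # number of groups
n = sum(len(g) for g in groups)     # total number of observations
grand_mean = sum(x for g in groups for x in g) / n

# Among-groups and within-groups sums of squares.
ssa = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups)
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

msa = ssa / (c - 1)   # mean squares among groups, df = c - 1
msw = ssw / (n - c)   # mean squares within groups, df = n - c
f_stat = msa / msw
print(f"MSA = {msa:.3f}  MSW = {msw:.3f}  F = {f_stat:.3f}")
```

With this data the statistic would be compared against an F distribution with c − 1 = 2 and n − c = 9 degrees of freedom.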
Diagram F distribution with reject region
Conclusions from One-Way ANOVA
A One-Way ANOVA test can only tell you that at least one of the means is different from the others. It does not tell you which group's mean is different. That is why the One-Way ANOVA test is also called an omnibus test. Remember, omnibus in Latin means "for all".
Also remember that when we were interpreting regression output, the F test and Significance F indicated whether the overall model was significant. Similarly, what you can conclude from a One-Way ANOVA test is whether or not there is a significant difference among the group means.
Note: You will have to perform a different set of tests, called post hoc tests, to know which of the sample groups' means are significantly different. Post hoc in Latin means "after this". Examples of post hoc tests include Tukey's HSD test, Fisher's LSD test, the Tukey–Kramer test, Duncan's Multiple Range test, Dunnett's Multiple Comparison test, the Newman–Keuls test, Scheffé's test, the Šidák correction, etc.
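As a hypothetical illustration of one of these post hoc tests, here is a minimal sketch of Fisher's LSD, which reuses the within-groups mean square (MSW) from the ANOVA for pairwise comparisons (the data is invented):

```python
# Fisher's LSD post hoc test: pairwise comparisons reusing MSW from the ANOVA.
from itertools import combinations
from math import sqrt
from scipy.stats import t

groups = {
    "A": [10.0, 12.0, 11.0, 13.0],
    "B": [14.0, 15.0, 13.0, 16.0],
    "C": [ 9.0, 10.0,  8.0, 11.0],
}
c = len(groups)
n = sum(len(g) for g in groups.values())
means = {k: sum(g) / len(g) for k, g in groups.items()}

# MSW from the one-way ANOVA: pooled within-group variance, df = n - c.
msw = sum((x - means[k]) ** 2 for k, g in groups.items() for x in g) / (n - c)

alpha = 0.05
t_crit = t.ppf(1 - alpha / 2, n - c)  # two-sided critical t with n - c df
for a, b in combinations(groups, 2):
    na, nb = len(groups[a]), len(groups[b])
    # Least significant difference for this pair of groups.
    lsd = t_crit * sqrt(msw * (1 / na + 1 / nb))
    diff = abs(means[a] - means[b])
    verdict = "differ" if diff > lsd else "no difference detected"
    print(f"{a} vs {b}: |diff| = {diff:.2f}, LSD = {lsd:.2f} -> {verdict}")
```

Each pair whose mean difference exceeds the LSD threshold is flagged as significantly different, which answers the "which groups?" question the omnibus test leaves open.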
We provide ANOVA tutoring on the different types of analysis of variances you will encounter in any business program. Do email or call us if you will benefit from tutoring on ANOVA.