What is ANOVA (Analysis of Variance)? Explained with Examples (ANOVA F-test)

In this article I'll walk you step‑by‑step through ANOVA (analysis of variance): what it is, why we use it, the main types, the assumptions you must check, how the F‑test works, and a worked example to make everything concrete.

Quick overview: What is ANOVA?

ANOVA stands for analysis of variance. It was developed by Ronald Fisher and allows us to test whether the means of three or more groups are statistically different. Think of ANOVA as an extension of the t‑test: a t‑test compares two means, while ANOVA compares multiple means in a single test.

When only two groups are involved, a t‑test and one‑way ANOVA give equivalent results (the F statistic is simply the square of the t statistic). But when you have more than two groups, running many pairwise t‑tests inflates the Type I error (false positive) rate. ANOVA avoids that by testing all group means together.

Why ANOVA (and not many t‑tests)?

  • Multiple t‑tests increase the probability of incorrectly rejecting a true null hypothesis (alpha inflates).
  • ANOVA tests whether there is a statistically significant difference among group means in one overall test, controlling the error rate.
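
The alpha inflation is easy to quantify: k groups require m = k(k−1)/2 pairwise comparisons, and if the tests were independent (a simplification — pairwise t‑tests on shared groups are actually correlated, but the inflation is still real), the chance of at least one false positive is 1 − (1 − α)^m. A quick sketch:

```python
# Familywise error rate for m pairwise comparisons at level alpha,
# under the simplifying assumption that the tests are independent.
from math import comb

def familywise_error(k_groups: int, alpha: float = 0.05) -> float:
    m = comb(k_groups, 2)          # number of pairwise comparisons
    return 1 - (1 - alpha) ** m

for k in (3, 5, 10):
    print(k, round(familywise_error(k), 3))
```

With only three groups the familywise rate is already about 14%, and with ten groups it exceeds 90% — which is exactly why a single overall test is preferable.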

Types of ANOVA

  • One‑way ANOVA: One factor (independent variable) with three or more levels (groups). Example: comparing exam scores from three study methods.
  • Two‑way ANOVA: Two independent factors are studied simultaneously (and you can test for an interaction between factors). Example: study method (A/B/C) and classroom type (online/in‑person).
  • Repeated measures ANOVA: Used when the same subjects are measured under different conditions or times (within‑subject design).

Core assumptions of ANOVA

  • The observations in each group are independently and randomly sampled.
  • The populations from which groups are drawn are approximately normally distributed.
  • Homogeneity of variances: each group has (approximately) the same variance.
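
These assumptions can be checked in code before running the ANOVA. A minimal sketch using SciPy (the score data below are hypothetical): `shapiro` tests normality within each group, and `levene` tests the equal-variance assumption.

```python
# Assumption checks for ANOVA on three hypothetical score samples:
# Shapiro-Wilk for within-group normality, Levene for equal variances.
from scipy import stats

groups = [
    [8.1, 9.3, 8.9, 7.5, 9.6, 8.8, 8.2, 9.0, 8.7, 8.9],  # method A
    [8.4, 8.8, 9.1, 8.0, 8.6, 8.9, 8.5, 8.3, 8.7, 8.7],  # method B
    [8.2, 8.6, 8.9, 8.1, 8.4, 8.8, 8.3, 8.2, 8.6, 8.9],  # method C
]

for i, g in enumerate(groups, start=1):
    _, p_norm = stats.shapiro(g)
    # Large p: no evidence against normality for this group.
    print(f"group {i}: Shapiro-Wilk p = {p_norm:.3f}")

_, p_var = stats.levene(*groups)
# Large p: no evidence against equal variances across groups.
print(f"Levene p = {p_var:.3f}")
```

Independence, by contrast, can't be tested after the fact — it has to come from the study design (random assignment, no repeated measurements of the same subject across groups).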

Null and alternative hypotheses

The null hypothesis in ANOVA is that all group means are equal:

H0: μ1 = μ2 = μ3 = ... = μk

The alternative hypothesis is that at least one group mean differs from the others:

H1: Not all μi are equal (at least one μi is different)

The F‑test: how ANOVA decides significance

ANOVA uses the F statistic. Intuitively the F statistic compares the variance of group means (variance between groups) to the variance inside the groups (variance within groups):

F = variance between groups / variance within groups

If the between‑group variance is large relative to the within‑group variance, group means are spread out beyond what we'd expect from random variation alone — yielding a large F and a small p‑value (evidence against H0).

Variance between vs variance within

  • Between‑group sum of squares (SSB): how far each group mean sits from the overall (grand) mean, weighted by group size. Large values suggest the group means differ.
  • Within‑group sum of squares (SSE): how much observations vary around their own group mean (the "noise" inside groups).
  • Total sum of squares: SST = SSB + SSE.
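
The decomposition is easy to verify numerically. Here's a minimal sketch in plain Python (the helper name `anova_ss` and the toy data are my own) that computes SSB, SSE, and SST for a list of groups and confirms SST = SSB + SSE:

```python
# Sum-of-squares decomposition for one-way ANOVA.
# Group sizes may differ; SSB weights each group by its size n_j.
def anova_ss(groups):
    all_obs = [x for g in groups for x in g]
    grand = sum(all_obs) / len(all_obs)             # grand mean
    means = [sum(g) / len(g) for g in groups]       # group means
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sum((x - grand) ** 2 for x in all_obs)
    return ssb, sse, sst

ssb, sse, sst = anova_ss([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(ssb, sse, sst)   # SST equals SSB + SSE (up to rounding)
```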

Worked example: Do three study methods produce different mean scores?

Scenario: 30 students are randomly assigned to three study methods (A, B, C), 10 students per method. Group means are:

  • Method A mean = 8.7
  • Method B mean = 8.6
  • Method C mean = 8.5

Overall (grand) mean = average of the three group means = 8.6 (averaging the group means gives the grand mean here only because the groups are equal in size).

Step 1: Between‑group sum of squares (SSB)

SSB = Σ n_j (x̄_j − x̄_grand)^2

With n_j = 10 for each group:

  • SSB = 10*(8.7 − 8.6)^2 + 10*(8.6 − 8.6)^2 + 10*(8.5 − 8.6)^2 = 0.1 + 0 + 0.1 = 0.2

Step 2: Within‑group sum of squares (SSE)

SSE = Σ Σ (x_ij − x̄_j)^2 — sum of deviations of each observation from its group mean.

In the example the within‑group sums (for the three groups) were 6.6, 10.9, and 10.5, giving

SSE = 6.6 + 10.9 + 10.5 = 28

Step 3: Compute the F statistic

First convert the sums of squares to mean squares by dividing by their degrees of freedom (a = 3 groups, N = 30 observations):

  • MSB = SSB / (a − 1) = 0.2 / 2 = 0.1
  • MSE = SSE / (N − a) = 28 / 27 ≈ 1.037

F = MSB / MSE = 0.1 / 1.037 ≈ 0.096

This F value is much less than 1, suggesting little evidence of between‑group differences relative to within‑group variability.

Step 4: Compare with critical F (or use p‑value)

Degrees of freedom:

  • Numerator df = a − 1 = 3 − 1 = 2
  • Denominator df = N − a = 30 − 3 = 27

From the F table at α = 0.05, F_critical for (2, 27) ≈ 3.35. Because F_stat (0.096) < F_critical (3.35), we fail to reject H0. In other words, there is no evidence that the study methods produced different mean scores in this sample.
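
The arithmetic above can be reproduced from the summary statistics alone (the raw scores aren't listed in this article, so the sketch works from the group means and the per‑group within sums of squares):

```python
# One-way ANOVA F statistic from the worked example's summary statistics.
n_per_group = 10
means = [8.7, 8.6, 8.5]                # group means for methods A, B, C
within_ss = [6.6, 10.9, 10.5]          # per-group sums of (x - group mean)^2

grand = sum(means) / len(means)        # valid because group sizes are equal
ssb = sum(n_per_group * (m - grand) ** 2 for m in means)
sse = sum(within_ss)

a = len(means)                         # number of groups
N = a * n_per_group                    # total observations
msb = ssb / (a - 1)                    # mean square between
mse = sse / (N - a)                    # mean square within
f_stat = msb / mse
print(round(ssb, 3), round(sse, 1), round(f_stat, 4))
```

The printed values match the hand calculation: SSB ≈ 0.2, SSE = 28.0, F ≈ 0.096.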

ANOVA bookkeeping: SS, df, MS, F, p (the ANOVA table)

A standard one‑way ANOVA table has rows for Between (Factor), Within (Error), and Total. Columns typically list:

  • SS — sum of squares (SSB, SSE, SST)
  • df — degrees of freedom (Between: a−1; Within: N−a, which equals a(n−1) in a balanced design; Total: N−1)
  • MS — mean square (MS = SS / df). MSB = SSB / (a−1); MSE = SSE / (N−a).
  • F — F = MSB / MSE
  • p‑value — probability of observing an F at least as extreme as the calculated value under H0

If all group means are truly equal, MSB and MSE estimate the same underlying variance, so F will be near 1. A large F means MSB ≫ MSE and suggests real group differences.
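
The "F near 1 under H0" claim can be checked by simulation. This sketch (standard library only; the population parameters are arbitrary) draws three same-sized groups from one normal population many times and averages the resulting F statistics:

```python
# Under H0 all groups come from the same population, so MSB and MSE
# estimate the same variance and F should average out near 1.
import random

random.seed(42)

def one_way_f(groups):
    obs = [x for g in groups for x in g]
    grand = sum(obs) / len(obs)
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    a, N = len(groups), len(obs)
    return (ssb / (a - 1)) / (sse / (N - a))

f_values = []
for _ in range(2000):
    # Three groups of 10, all from the SAME normal population (H0 true).
    groups = [[random.gauss(8.6, 1.0) for _ in range(10)] for _ in range(3)]
    f_values.append(one_way_f(groups))

print(round(sum(f_values) / len(f_values), 2))  # close to 1
```

(Strictly, the mean of an F(2, 27) variable is 27/25 = 1.08, so "close to 1" rather than exactly 1.)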

Interpreting results and practical notes

  • Reject H0 if F > F_critical or p < α.
  • ANOVA tells you that at least one mean differs, but it does not tell you which groups are different. Use post‑hoc tests (Tukey, Bonferroni, etc.) to compare pairs.
  • Always check ANOVA assumptions; if homogeneity or normality is violated, consider transformations or nonparametric alternatives (e.g., Kruskal‑Wallis).
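
As a follow-up illustration: the simplest post‑hoc strategy is Bonferroni — run the pairwise t‑tests, but test each at level α / m, where m is the number of pairs. It is more conservative than Tukey's HSD but easy to sketch with SciPy (the scores below are hypothetical):

```python
# Bonferroni-corrected pairwise t-tests after a significant ANOVA.
from itertools import combinations
from scipy import stats

groups = {
    "A": [8.1, 9.3, 8.9, 7.5, 9.6, 8.8, 8.2, 9.0, 8.7, 8.9],
    "B": [8.4, 8.8, 9.1, 8.0, 8.6, 8.9, 8.5, 8.3, 8.7, 8.7],
    "C": [8.2, 8.6, 8.9, 8.1, 8.4, 8.8, 8.3, 8.2, 8.6, 8.9],
}

pairs = list(combinations(groups, 2))
alpha_adj = 0.05 / len(pairs)          # Bonferroni: alpha / m
for g1, g2 in pairs:
    _, p = stats.ttest_ind(groups[g1], groups[g2])
    verdict = "different" if p < alpha_adj else "no evidence of difference"
    print(f"{g1} vs {g2}: p = {p:.3f} -> {verdict}")
```

For the nonparametric route, `scipy.stats.kruskal(*groups.values())` runs the Kruskal‑Wallis test on the same data.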

Quick quiz

  1. How is the significance of an ANOVA test determined: by calculating T statistics, F statistics, or chi‑square statistics?
  2. Analysis of variance is a statistical method of comparing: population means, variances, standardization, or none of the above?
  3. The sum of squares measures the variability of observed values around their respective: treatments, other, total, interaction?

Answers

  1. F statistics. ANOVA uses the F distribution.
  2. Population means. ANOVA compares the means of several populations.
  3. Treatments (or group means). Sum of squares (for factor/treatment) measures variability around group (treatment) means; total SS measures variability around the grand mean.

Conclusion

ANOVA is a powerful and widely used method to test whether multiple group means differ while controlling Type I error. Key takeaways:

  • Use one‑way ANOVA for a single factor with multiple levels; two‑way for two factors; repeated measures when the same subjects are measured repeatedly.
  • Understand the difference between variance between and variance within — ANOVA is fundamentally a ratio of these variances.
  • Check assumptions, compute SS → MS → F, then interpret using critical F or p‑value. If significant, follow up with post‑hoc tests to find where differences lie.