Degrees of Freedom
Degrees of freedom is a statistical term referring to the number of independent values that can vary in a dataset after constraints have been applied.
Put simply, degrees of freedom tell us how many observations can change freely while the totals or other fixed constraints stay the same.
The term is commonly abbreviated as df or dof.
The number of degrees of freedom varies depending on the statistical test being performed.
Degrees of Freedom in Independence Tests
In the chi-square test, degrees of freedom are determined based on the dimensions of the contingency table used to analyze the data.
The degrees of freedom for a contingency table are calculated using this formula:
$$ df = (r - 1) \times (c - 1) $$
Where \( r \) represents the number of categories (rows) of the first variable, and \( c \) represents the number of categories (columns) of the second variable.
This calculation shows the number of values that can vary independently while keeping the row and column totals unchanged.
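As a minimal sketch, the formula above translates directly into a one-line helper function (the function name is my own choice for illustration):

```python
def degrees_of_freedom(rows: int, cols: int) -> int:
    """Degrees of freedom for an r x c contingency table: (r - 1) * (c - 1)."""
    return (rows - 1) * (cols - 1)

# A 2 x 3 table leaves 2 cells free to vary once the margins are fixed.
print(degrees_of_freedom(2, 3))  # 2
```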
For example, let’s consider a contingency table with 2 rows (gender, with two categories) and 3 columns (favorite subject, with three categories).
| | Mathematics | Literature | Physics | Total |
|---|---|---|---|---|
| Male | 20 | 15 | 10 | 45 |
| Female | 10 | 25 | 20 | 55 |
| Total | 30 | 40 | 30 | 100 |
The degrees of freedom are calculated as follows:
$$ df = (2 - 1) \times (3 - 1) = 1 \times 2 = 2 $$
In this example, the chi-square test has 2 degrees of freedom.
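The same result can be checked in code. The sketch below, assuming SciPy is installed, runs a chi-square test of independence on the table above; `chi2_contingency` reports the degrees of freedom it used:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts from the 2 x 3 table above (rows: Male, Female;
# columns: Mathematics, Literature, Physics)
observed = np.array([
    [20, 15, 10],
    [10, 25, 20],
])

chi2_stat, p_value, dof, expected = chi2_contingency(observed)
print(dof)  # 2, matching (2 - 1) * (3 - 1)
```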
Why Are Degrees of Freedom Important?
Degrees of freedom are crucial because they determine the shape of the chi-square distribution, which is used to compare the calculated test statistic against a critical value.
Because the distribution changes with the degrees of freedom, the critical value needed to establish statistical significance depends directly on this number.
Therefore, understanding degrees of freedom is key to accurately interpreting the results of statistical tests.
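To see how the critical value depends on the degrees of freedom, the sketch below (assuming SciPy is available) looks up the 5% critical value of the chi-square distribution for a few df values using the inverse CDF, `chi2.ppf`:

```python
from scipy.stats import chi2

alpha = 0.05  # common significance level
for df in (1, 2, 5):
    # Critical value: the point beyond which the upper tail has probability alpha
    critical = chi2.ppf(1 - alpha, df)
    print(f"df={df}: critical value ≈ {critical:.3f}")
```

With 2 degrees of freedom, as in the example above, the critical value at the 5% level is about 5.991, so a calculated statistic larger than that would indicate a significant association.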