Frequency Tables in Statistics
In statistics, a frequency table is a tool used to organize and present data collected from a survey in a clear and systematic way.
This type of table categorizes the data into classes or groups and shows how often each class or group appears in the dataset, known as the frequency of occurrence.
A typical frequency table includes four main columns:
- Frequency Classes or Categories
These represent the different groups or intervals into which the data is divided. For example, if you were analyzing the heights of a group of people, the classes might be height ranges like "150-159 cm," "160-169 cm," and so on.
- Absolute Frequency (F)
This shows the number of observations, or occurrences, that fall into each class or category. Continuing with the height example, the absolute frequency for the "150-159 cm" class might be 12, meaning 12 people have a height between 150 and 159 cm.
- Relative Frequency (f)
This is the ratio of the absolute frequency of a class to the total number of observations: $$ f = \frac{F}{T} $$ where F is the absolute frequency of the class and T is the total number of observations. The sum of all relative frequencies in a table equals 1. The relative frequency is often expressed as a percentage: $$ f = \frac{F}{T} \cdot 100 $$ In this case, the relative frequencies sum to 100%.
For instance, if there are 100 people in total, and 12 of them fall into the "150-159 cm" class, the relative frequency is 12%. Here, F=12 and T=100, so $$ f = \frac{12}{100} \cdot 100 = 0.12 \cdot 100 = 12\% $$
- Cumulative Frequency
The cumulative frequency is the running total of the absolute or relative frequencies up to and including a given class or category. For absolute frequencies, it is the number of observations that fall within that class and all preceding classes; for relative frequencies, it is the corresponding share of the total. This helps in understanding how data accumulates across the distribution.
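The four quantities described above can be computed together. The following is a minimal Python sketch, assuming the class counts are already known and listed in class order; the function name `frequency_table` and the counts in `example` are illustrative assumptions, not part of the original text.

```python
from itertools import accumulate


def frequency_table(counts):
    """Build table rows from an ordered mapping of class -> absolute frequency."""
    total = sum(counts.values())                    # T, the total number of observations
    cumulative = list(accumulate(counts.values()))  # running totals of F
    rows = []
    for (label, freq), cum in zip(counts.items(), cumulative):
        rows.append({
            "class": label,
            "absolute": freq,                  # F
            "relative": freq / total,          # f = F / T
            "cumulative": cum,                 # sum of F up to and including this class
            "cumulative_relative": cum / total,
        })
    return rows


# Hypothetical counts for three height classes (illustrative numbers only).
example = {"150-159 cm": 12, "160-169 cm": 38, "170-179 cm": 50}
for row in frequency_table(example):
    print(row)
```

With these assumed counts the total is 100, so the first class has a relative frequency of 0.12, matching the 12% case described above. Multiply the relative values by 100 to express them as percentages.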
Frequency tables are incredibly useful for quickly analyzing data distributions and often serve as the basis for graphical representations like histograms or bar charts.
They also make it easy to identify the mode (the most frequent value or, when the data is grouped into intervals, the most frequent class) and provide an immediate snapshot of the data distribution within the sample.
Practical Example
I gathered data on the heights of 30 students, with the following results:
- 150-159 cm: 3 students
- 160-169 cm: 8 students
- 170-179 cm: 15 students
- 180-189 cm: 4 students
The frequency table could look like this:
Height (cm) | Absolute Frequency | Relative Frequency (%) | Cumulative Frequency | Cumulative Frequency (%) |
---|---|---|---|---|
150-159 | 3 | 10% | 3 | 10% |
160-169 | 8 | 26.7% | 11 | 36.7% |
170-179 | 15 | 50% | 26 | 86.7% |
180-189 | 4 | 13.3% | 30 | 100% |
Total | 30 | 100% | | |
The set of pairs consisting of the category (first column) and the corresponding absolute frequency (second column) is known as the frequency distribution.
For example, (150-159, 3), (160-169, 8), (170-179, 15), (180-189, 4)
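Because the frequency distribution is just a set of (category, frequency) pairs, the most frequent class can be read off directly. A short Python sketch using the pairs from this example (the variable names are illustrative):

```python
# Frequency distribution from the example: (class, absolute frequency) pairs.
distribution = [("150-159", 3), ("160-169", 8), ("170-179", 15), ("180-189", 4)]

# The modal class is the pair with the largest absolute frequency.
modal_class, modal_frequency = max(distribution, key=lambda pair: pair[1])
print(modal_class, modal_frequency)  # 170-179 15
```

Note that `max` returns only the first such pair if two classes tie for the highest frequency.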
The fourth column shows the cumulative absolute frequency.
For example, in the row for the "170-179 cm" category, the cumulative frequency is 26 because the sum of the absolute frequencies up to this category is $$ 3+8+15=26 $$
The fifth column shows the cumulative relative frequency, where the relative frequencies are added together.
For instance, in the row for the "170-179 cm" category, the cumulative relative frequency is 86.7% because the sum of the relative frequencies up to this category is $$ 10\% +26.7\% +50\% =86.7\% $$
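The running totals in the table can be checked with a few lines of Python. This sketch uses the absolute frequencies from the example and, as one possible design choice, derives each cumulative percentage from the cumulative count rather than by adding already-rounded percentages, so rounding errors do not build up.

```python
from itertools import accumulate

# Absolute frequencies of the four height classes, in table order.
absolute = [3, 8, 15, 4]
total = sum(absolute)

# Running total of the absolute frequencies.
cumulative = list(accumulate(absolute))

# Cumulative relative frequency as a percentage, rounded to one decimal place.
cumulative_pct = [round(100 * c / total, 1) for c in cumulative]

print(cumulative)      # [3, 11, 26, 30]
print(cumulative_pct)  # [10.0, 36.7, 86.7, 100.0]
```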
In conclusion, frequency tables are a crucial component of statistical analysis, allowing raw data to be transformed into a more interpretable and visually accessible format.