Deciles
What are deciles?
Deciles are nine positional indices (quantiles) that divide a statistical distribution into ten equal parts.
Each part contains the same number of elements.
There are nine deciles:
- The first decile (D1) marks the point below which the first 1/10 (10%) of the elements fall, counting from the left of the distribution.
- The second decile (D2) marks the point below which the first 2/10 (20%) of the elements fall, counting from the left of the distribution.
- The third decile (D3) marks the point below which the first 3/10 (30%) of the elements fall, counting from the left of the distribution.
- The fourth decile (D4) marks the point below which the first 4/10 (40%) of the elements fall, counting from the left of the distribution.
- The fifth decile (D5) marks the point below which the first 5/10 (50%) of the elements fall, counting from the left of the distribution.
- The sixth decile (D6) marks the point below which the first 6/10 (60%) of the elements fall, counting from the left of the distribution.
- The seventh decile (D7) marks the point below which the first 7/10 (70%) of the elements fall, counting from the left of the distribution.
- The eighth decile (D8) marks the point below which the first 8/10 (80%) of the elements fall, counting from the left of the distribution.
- The ninth decile (D9) marks the point below which the first 9/10 (90%) of the elements fall, counting from the left of the distribution.
Example. In this series of 40 elements, the nine deciles divide the ordered distribution into ten equal parts, with each part containing four elements.
How to Calculate Deciles
There are two methods to calculate deciles, depending on whether the distribution X is a set of values or a frequency distribution.
For a set of values
To calculate deciles for a series of values:
- Sort the values in ascending order.
- Multiply the total number of elements n by p=1/10 for D1, p=2/10 for D2, and so on up to p=9/10 for D9: $$ k = n \cdot p $$
- Determine the position of the decile:
- If k is an integer, the decile is the average of the k-th and (k+1)-th elements of the distribution.
- If k is not an integer, round it up to the nearest whole number. The decile is the value at the k-th position.
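The steps above can be sketched in Python. This is a minimal illustration of the rounding rule described here, not a standard library routine; the function name `decile` and its parameters are assumptions made for the example. The position k = n·j/10 is kept as an integer quotient and remainder so that the "is k an integer?" test does not depend on floating-point rounding.

```python
def decile(values, j):
    """Return the j-th decile (j = 1..9) of a set of values.

    Hypothetical helper that follows the rule in the text:
    k = n * j / 10; average two elements if k is an integer,
    otherwise round k up and take the element at that position.
    """
    if not 1 <= j <= 9:
        raise ValueError("j must be between 1 and 9")
    data = sorted(values)              # step 1: sort in ascending order
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")

    # step 2: k = n * j / 10, split into integer part and remainder
    k, remainder = divmod(n * j, 10)

    if remainder == 0:
        # k is an integer: average the k-th and (k+1)-th elements (1-based)
        return (data[k - 1] + data[k]) / 2
    # k is not an integer: round up to position k + 1 (1-based), index k
    return data[k]


print(decile([9, 6, 11, 8, 4, 7, 10, 3, 5], 4))
print(decile([9, 6, 8, 4, 7, 10, 3, 5], 5))
```

Other conventions for quantiles exist, so library functions may return slightly different values for the same data.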
For frequency distributions
To calculate deciles for a frequency distribution:
- Calculate the cumulative absolute frequencies for each class of the distribution.
- Multiply the total cumulative frequency by 1/10, 2/10, 3/10, and so on up to 9/10. This gives you the positions of the deciles (D1, D2, D3, ..., D9) within the cumulative frequencies.
- Identify the intervals in the cumulative frequencies that correspond to the deciles D1, D2, D3, ..., D9. The respective classes in the frequency distribution represent the deciles.
Note. There are several methods to determine decile values. For instance, you can use the midpoint of a class as the decile. For greater precision, deciles can also be calculated using linear interpolation.
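As a rough sketch, the procedure for a frequency distribution could look like this in Python. The function name `decile_class` and the small table of scores are hypothetical, invented only for illustration. The sketch assumes that a position k belongs to the first class whose cumulative frequency reaches k.

```python
from itertools import accumulate


def decile_class(classes, frequencies, j):
    """Return the class that contains the j-th decile (j = 1..9).

    Hypothetical helper: the decile position is k = total * j / 10,
    and the decile class is the first one whose cumulative
    frequency is greater than or equal to k.
    """
    total = sum(frequencies)
    for label, cumulative in zip(classes, accumulate(frequencies)):
        # compare k = total * j / 10 with the cumulative frequency,
        # written with integers to avoid floating-point issues
        if total * j <= 10 * cumulative:
            return label
    raise ValueError("j must be between 1 and 9")


# Made-up data: scores 1..5 with their absolute frequencies (total 20)
scores = [1, 2, 3, 4, 5]
counts = [2, 5, 6, 4, 3]          # cumulative: 2, 7, 13, 17, 20

print(decile_class(scores, counts, 3))   # position k = 6
print(decile_class(scores, counts, 5))   # position k = 10
```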
A Practical Example
Example 1
Let's consider a distribution with n=9 elements:
$$ X = \{ 9,6,11,8,4,7,10,3,5 \} $$
We sort the values in ascending order:
$$ X = \{ 3,4,5,6,7,8,9,10,11 \} $$
To calculate the first decile (D1), multiply the total number of elements (n=9) by 1/10:
$$ k = n \cdot \frac{1}{10} = 9 \cdot \frac{1}{10} =0.9 $$
The result is a non-integer k=0.9.
So, we round up the position of the first decile to the nearest whole number (k=1).
$$ X = \{ \color{red}{3},4,5,6,7,8,9,10,11 \} $$
The first element (k=1) in the series is 3.
$$ D_1 = 3 $$
Thus, the first decile of the distribution X is D1=3.
$$ X = \{ \underbrace{3}_{D_1},4,5,6,7,8,9,10,11 \} $$
To calculate the fourth decile (D4), multiply the number of elements (n=9) by 4/10:
$$ k = n \cdot \frac{4}{10} = 9 \cdot \frac{4}{10} =3.6 $$
The result is a decimal k=3.6.
So, we round the position up to k=4.
$$ X = \{ 3,4,5, \color{red}{6},7,8,9,10,11 \} $$
The fourth element (k=4) in the series is 6.
$$ D_4 = 6 $$
Thus, the fourth decile of the series is D4=6.
$$ X = \{ 3,4,5,\underbrace{6}_{D_4},7 ,8,9,10,11 \} $$
To calculate the seventh decile (D7), multiply the number of elements in the series (n=9) by 7/10:
$$ k = n \cdot \frac{7}{10} = 9 \cdot \frac{7}{10} =6.3 $$
The result is a decimal k=6.3.
So, we round the position up to k=7.
$$ X = \{ 3,4,5,6,7,8,\color{red}{9},10,11 \} $$
The seventh element (k=7) is 9.
$$ D_7 = 9 $$
Thus, the seventh decile of the series is D7=9.
$$ X = \{ 3,4,5,6,7,8,\underbrace{9}_{D_7},10,11 \} $$
The other deciles in the series can be calculated in the same way.
Note. In this case, the deciles D1=3, D4=6, and D7=9 are approximate values that belong to the distribution X. This may not always be the case.
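To check the remaining deciles of this example, the same rule can be applied in a short loop. This is only a sketch; the variable names are assumptions. It also tests whether every decile is a member of X, which happens here because no product n·j/10 is an integer when n=9.

```python
data = sorted([9, 6, 11, 8, 4, 7, 10, 3, 5])
n = len(data)

deciles = {}
for j in range(1, 10):
    # k = n * j / 10 as integer part and remainder
    k, remainder = divmod(n * j, 10)
    if remainder == 0:
        # integer position: average of the k-th and (k+1)-th elements
        deciles[j] = (data[k - 1] + data[k]) / 2
    else:
        # non-integer position: round up, element at 1-based position k + 1
        deciles[j] = data[k]

for j, value in deciles.items():
    print(f"D{j} = {value}")

# True when every decile is one of the original values
print(all(value in data for value in deciles.values()))
```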
Example 2
This distribution consists of n=8 elements:
$$ X = \{ 9,6,8,4,7,10,3,5 \} $$
We sort the values of X in ascending order:
$$ X = \{ 3,4,5,6,7,8,9,10 \} $$
To calculate the fifth decile (D5), multiply the number of elements (n=8) by 5/10:
$$ k = n \cdot \frac{5}{10} = 8 \cdot \frac{5}{10} =4 $$
The product k=4 is an integer.
Therefore, calculate the average between the value at position k=4 and the value at position k+1=5:
$$ X = \{ 3,4,5,\color{red}6, \color{red}7,8,9,10 \} $$
The fourth element (k=4) is 6, and the fifth element (k=5) is 7.
$$ D_5 = \frac{6+7}{2} = 6.5 $$
Thus, the fifth decile of distribution X is D5=6.5.
Note. In this case, the decile is a value that does not belong to the distribution X.
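Since the fifth decile splits the distribution in half, it coincides with the median. A quick cross-check with Python's standard `statistics.median`, which averages the two middle values when the number of elements is even:

```python
import statistics

data = [9, 6, 8, 4, 7, 10, 3, 5]

# The fifth decile of an even-sized series is the mean of the two middle values
d5 = statistics.median(data)

print(d5)            # the value of D5
print(d5 in data)    # whether D5 is one of the original values
```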
Example 3
Let's consider this frequency distribution:
This table shows the exam grades of 40 students.
The variable being measured is the grade, ranging from 18 to 30, while the absolute frequencies represent the number of students.
To find the deciles, we add a column for cumulative frequencies:
The total cumulative frequency is ftot=40.
To find the third decile, multiply the total cumulative frequency ftot=40 by 3/10:
$$ k =f_{tot} \cdot \frac{3}{10} = 40 \cdot \frac{3}{10} = 12 $$
The result, 12, falls within the 9-13 interval of the cumulative frequencies.
Thus, the third decile is in the class D3=21.
To find the sixth decile, multiply the total cumulative frequency ftot=40 by 6/10:
$$ k =f_{tot} \cdot \frac{6}{10} = 40 \cdot \frac{6}{10} = 24 $$
The result, 24, falls within the 22-30 interval of the cumulative frequencies.
Thus, the sixth decile is in the class D6=25.
To find the eighth decile, multiply the total cumulative frequency ftot=40 by 8/10:
$$ k =f_{tot} \cdot \frac{8}{10} = 40 \cdot \frac{8}{10} = 32 $$
The result, 32, falls within the 30-34 interval of the cumulative frequencies.
Thus, the eighth decile is in the class D8=26.
Note. When the 40 grades are arranged from lowest to highest, the nine deciles D1, D2, D3, ..., D9 represent the points that divide the series into ten equal parts.
Example 4
This frequency distribution is divided into classes:
We add a column for cumulative absolute frequencies:
The total cumulative frequency is ftot=40.
To find the third decile D3, multiply the total cumulative frequency ftot=40 by 3/10:
$$ k =f_{tot} \cdot \frac{3}{10} = 40 \cdot \frac{3}{10} = 12 $$
The result, 12, falls within the cumulative frequency range of 9 to 16 for the 21-22 class.
To estimate the value of the decile within this class, we use linear interpolation:
$$ D_3 = x_{inf} + (x_{sup} - x_{inf}) \cdot \frac{ c - n_{prec} }{n_{classe}} $$
Here’s what the terms mean:
- xinf=21 and xsup=22 are the boundaries of the 21-22 class.
- c=12 is the position of the third decile.
- nclasse=7 is the frequency of the 21-22 class.
- nprec=9 is the cumulative frequency of the previous classes.
Now, we substitute the values and do the calculations:
$$ D_3 = 21 + (22 - 21) \cdot \frac{ 12 - 9 }{7} $$
$$ D_3 = 21 + 1 \cdot \frac{ 3 }{7} $$
$$ D_3 \approx 21.43 $$
Thus, the third decile is approximately D3=21.43.
Note. Alternatively, you could estimate the decile using the midpoint of the class. In this case, the midpoint of the 21-22 class is 21.5, so the approximate value of the third decile is D3=21.5. This estimate is less precise than linear interpolation, but it is quicker and easier to calculate.
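The two estimates can be compared with a few lines of Python. This is a sketch only: the function name `interpolated_decile` is an assumption, and its parameters mirror the symbols of the interpolation formula (with `n_class` standing for the class frequency).

```python
def interpolated_decile(x_inf, x_sup, c, n_prec, n_class):
    """Linear interpolation inside the class that contains the decile.

    x_inf, x_sup: class boundaries
    c:            position of the decile
    n_prec:       cumulative frequency of the previous classes
    n_class:      frequency of the class itself
    """
    return x_inf + (x_sup - x_inf) * (c - n_prec) / n_class


# Third decile of the 21-22 class, using the values quoted above
d3_interpolated = interpolated_decile(21, 22, 12, 9, 7)

# Quicker alternative: the midpoint of the class
d3_midpoint = (21 + 22) / 2

print(f"interpolated: {d3_interpolated:.4f}")
print(f"midpoint:     {d3_midpoint:.4f}")
print(f"difference:   {d3_midpoint - d3_interpolated:.4f}")
```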
To calculate the sixth decile D6, multiply the total cumulative frequency ftot=40 by 6/10:
$$ k =f_{tot} \cdot \frac{6}{10} = 40 \cdot \frac{6}{10} = 24 $$
The result, 24, falls within the cumulative frequency range of 16 to 30 for the 23-25 class.
We use linear interpolation to estimate the value of the sixth decile:
$$ D_6 = x_{inf} + (x_{sup} - x_{inf}) \cdot \frac{ c - n_{prec} }{n_{classe}} $$
Here’s what the terms mean:
- xinf=23 and xsup=25 are the boundaries of the 23-25 class.
- c=24 is the position of the sixth decile.
- nclasse=14 is the frequency of the 23-25 class.
- nprec=16 is the cumulative frequency of the previous classes.
Now, we substitute the values and do the calculations:
$$ D_6 = 23 + (25 - 23) \cdot \frac{ 24 - 16 }{14} $$
$$ D_6 = 23 + 2 \cdot \frac{ 8 }{14} $$
$$ D_6 = 24.14 $$
Thus, the sixth decile is D6=24.14.
To calculate the eighth decile D8, multiply the total cumulative frequency ftot=40 by 8/10:
$$ k =f_{tot} \cdot \frac{8}{10} = 40 \cdot \frac{8}{10} = 32 $$
The result, 32, falls within the cumulative frequency range of 30 to 39 for the 26-28 class.
We use linear interpolation to estimate the value of the eighth decile:
$$ D_8 = x_{inf} + (x_{sup} - x_{inf}) \cdot \frac{ c - n_{prec} }{n_{classe}} $$
Here’s what the terms mean:
- xinf=26 and xsup=28 are the boundaries of the 26-28 class.
- c=32 is the position of the eighth decile.
- nclasse=9 is the frequency of the 26-28 class.
- nprec=30 is the cumulative frequency of the previous classes.
Now, we substitute the values and do the calculations:
$$ D_8 = 26 + (28 - 26) \cdot \frac{ 32 - 30 }{9} $$
$$ D_8 = 26 + 2 \cdot \frac{ 2 }{9} $$
$$ D_8 = 26.44 $$
Thus, the eighth decile is D8=26.44.
And so on.
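The whole procedure of this example (locate the class, then interpolate) can be sketched in Python. Only the three classes quoted in the calculations above are listed; the rest of the table is not reproduced here, so positions outside these classes raise an error. The names `classes`, `TOTAL`, and `grouped_decile` are assumptions made for the sketch.

```python
# (lower bound, upper bound, class frequency, cumulative frequency of previous classes)
# Values taken from the three calculations above; other classes are omitted.
classes = [
    (21, 22, 7, 9),
    (23, 25, 14, 16),
    (26, 28, 9, 30),
]
TOTAL = 40  # total cumulative frequency


def grouped_decile(j):
    """Estimate the j-th decile by linear interpolation within its class."""
    c = TOTAL * j / 10  # position of the decile
    for x_inf, x_sup, n_class, n_prec in classes:
        # the decile class is the one whose cumulative range contains c
        if n_prec < c <= n_prec + n_class:
            return x_inf + (x_sup - x_inf) * (c - n_prec) / n_class
    raise ValueError("position falls outside the classes listed here")


for j in (3, 6, 8):
    print(f"D{j} = {grouped_decile(j):.2f}")
```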