Measures of variability play an important role in our lives, don't they? Let me explain with an example.
Suppose two pizza restaurants advertise that they can deliver your pizza within an average of 20 minutes.
Sounds good, doesn't it? But when you are hungry, you might be unsure which restaurant is the better choice for your order.
This is when you need to consider the variability of both restaurants' delivery times. Not sure what measures of variability are?
Do not worry; I have listed all the details you need to understand what measures of variability are and how to calculate them.
Moreover, I explain below how you can determine the best restaurant to order your favorite pizza from.
So, without further ado, let's get started with this statistics concept.
What are measures of variability?
A measure of variability is a summary statistic that represents the amount of dispersion in a dataset. A measure of central tendency, on the other hand, describes the typical value.
Statisticians use measures of variability to check how far the data points tend to fall from the central value. In other words, variability describes how spread out the distribution of values is.
A lower dispersion value shows that the data points are grouped nearer to the center.
A higher dispersion value shows that the data points are spread further away from the center.
Does variability really matter?
Yes, it matters!!
Lower variability is considered ideal because it allows better predictions about the population. In contrast, higher variability means the values are less consistent, which makes predictions much harder.
Moreover, data sets might have a similar central tendency but different levels of variability, or vice versa.
If you know only the variability or only the central tendency, you cannot say anything about the other aspect. Together, the two give you a clear picture of your data.
What is the use of measures of variability?
Variability is everywhere. Suppose you order your favorite dish at a restaurant repeatedly; it is not exactly the same each time.
Similarly, the parts coming off an assembly line might seem to be identical, but they actually have slightly different widths and lengths. This is where you need to apply the concept of variability to identify which process is the most consistent.
Apart from this, some degree of variation is unavoidable, but too much inconsistency can create problems. How?
Suppose your trip to work takes much longer than the average time; then you might be late. If your pizza tastes much different from the previous one, then you might not order it again. This is how you can use the concept of measures of variability.
What are the 4 measures of variability?
The range tells you the spread of the data from the lowest to the highest value in the distribution. Additionally, it is considered the easiest measure of variability to calculate.
To calculate it, subtract the lowest value from the highest value of the given dataset.
Let’s take an example to understand it:
Suppose you have 5 data points as:
It is clear that 40 is the highest value and 5 is the lowest value. Therefore,
=> R = H-L => 40-5 => 35
The range of the data is 35 minutes.
|Note: Because the range uses only 2 numbers, the highest and the lowest, outliers can strongly influence it. Moreover, the range does not give any information about how the values in between are distributed. |
To get accurate results, combine the range with other measures.
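The range calculation, and its sensitivity to a single outlier, can be sketched in a few lines of Python. This is only an illustration: the list `delivery_minutes` is hypothetical (only the extremes 5 and 40 come from the example above), and the helper name `value_range` is my own.

```python
# Hypothetical delivery times in minutes. Only the lowest (5) and the
# highest (40) values come from the example; the rest are made up.
delivery_minutes = [5, 12, 20, 28, 40]


def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)


base_range = value_range(delivery_minutes)

# Adding one extreme (hypothetical) value changes the range drastically,
# because the range depends on only two numbers.
with_outlier = delivery_minutes + [120]
outlier_range = value_range(with_outlier)

print(base_range)     # 35
print(outlier_range)  # 115
```

One unusually late delivery more than triples the range here, which is why the range is best read together with another measure.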
The IQR (interquartile range) gives the spread of the middle of the distribution. For any distribution, the IQR contains the middle half of the values.
It is calculated as the third quartile (Q3) minus the first quartile (Q1).
Let’s take an example of it:
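Suppose you need to calculate the IQR of 8 data points. First, sort the data and find the positions of Q1 and Q3 by multiplying the number of data points by 0.25 and 0.75, respectively.
Q1 position = 0.25*8 = 2
Q3 position = 0.75*8 = 6
Q1 is the value in the 2nd position, which is 110, and Q3 is the value in the 6th position, which is 287. Now, the IQR will be: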
=> 287 – 110 = 177
|Note: Just like the range, the IQR uses only two values for the calculation. But the IQR is less affected by outliers. In addition to this, the IQR gives a consistent measure of variability for both normal and skewed distributions.|
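The position-based IQR calculation described above can be sketched as follows. The data in `scores` is hypothetical; it is chosen only so that the 2nd and 6th sorted values are 110 and 287, as in the example. The function names are my own, and the sketch assumes the number of data points is a multiple of 4 so that both positions are whole numbers.

```python
def quartile_positions(n):
    """Return the 1-based positions of Q1 and Q3 for n sorted values.

    Assumes n is a multiple of 4 so that 0.25*n and 0.75*n are whole numbers.
    """
    if n % 4 != 0:
        raise ValueError("this simple sketch needs a multiple of 4 data points")
    return n // 4, 3 * n // 4


def iqr_by_position(values):
    """Return (Q1, Q3, IQR) using the simple position method."""
    ordered = sorted(values)
    q1_pos, q3_pos = quartile_positions(len(ordered))
    q1 = ordered[q1_pos - 1]  # convert 1-based position to 0-based index
    q3 = ordered[q3_pos - 1]
    return q1, q3, q3 - q1


# Hypothetical data: only the values 110 and 287 come from the example.
scores = [52, 110, 145, 190, 230, 287, 310, 402]
q1, q3, iqr = iqr_by_position(scores)
print(q1, q3, iqr)  # 110 287 177
```

Library routines usually interpolate between neighboring values, so they may report slightly different quartiles for the same data.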
The standard deviation (SD) is the average amount of variability in the data set. It tells how far, on average, each score lies from the mean. The larger the SD, the more variable the data set is.
Use the following formulas to calculate the standard deviation of the data set.
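In standard textbook notation, the population and sample standard deviation can be written as below. The symbols are an assumption on my part: σ and μ for the population SD and mean, s and x̄ for the sample SD and mean, with N and n as the respective numbers of values.

```latex
% Population standard deviation (divide by N)
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}

% Sample standard deviation (divide by n - 1)
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```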
Follow these steps to calculate the SD.
|List each score and calculate the average.|
|Subtract the average from each score to get its deviation from the average.|
|Square each deviation and sum all the squared deviations.|
|Divide the sum by N (for a population) or n - 1 (for a sample).|
|Take the square root of the result to get the standard deviation.|
Let’s take an example of it:
Suppose you have 5 data points, and you have to calculate SD.
|Data|Deviation from average|Squared deviation|
|70|70 – 70 = 0|0|
|110|110 – 70 = 40|1600|
|50|50 – 70 = (-20)|400|
|20|20 – 70 = (-50)|2500|
|100|100 – 70 = 30|900|
Average = 70
Sum of the squared deviations = 5400
As we are dealing with a sample, we need to use n – 1.
n – 1 = 5 – 1 => 4
5400/4 => 1350
s = √1350 = 36.74
The standard deviation of the given data is 36.74.
This implies that, on average, each score deviates from the mean by about 36.74 points.
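The five steps can be followed literally in Python. This is a minimal sketch using the five scores from the example; the function name `sample_sd` is my own, and the result is compared with `statistics.stdev` from the standard library, which also divides by n - 1.

```python
import math
import statistics

# The five scores from the worked example.
scores = [70, 110, 50, 20, 100]


def sample_sd(values):
    """Compute the sample standard deviation step by step."""
    average = sum(values) / len(values)                # step 1: average
    deviations = [v - average for v in values]         # step 2: deviations
    sum_of_squares = sum(d ** 2 for d in deviations)   # step 3: square and sum
    variance = sum_of_squares / (len(values) - 1)      # step 4: divide by n - 1
    return math.sqrt(variance)                         # step 5: square root


sd = sample_sd(scores)
print(round(sd, 2))                                # 36.74
print(math.isclose(sd, statistics.stdev(scores)))  # True
```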
Variance is the mean of the squared deviations from the average. In other words, variance is the square of the standard deviation. It is important to note that variance is harder to interpret because it is expressed in squared units rather than the original units of the data.
The variance shows the degree of spread within the data set. The larger the variance, the more spread out the data.
Following are the formulas to calculate the variance.
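Written in standard textbook notation, the population and sample variance look like this. As before, the symbols are my assumption: σ² and μ for the population, s² and x̄ for the sample.

```latex
% Population variance (divide by N)
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}

% Sample variance (divide by n - 1)
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
```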
Let's take an example to understand it:
Consider the above example of standard deviation. Square the standard deviation value as:
s = 36.74
variance = (s)^2
36.74 * 36.74 ≈ 1350.
|Note: To calculate the variance directly, perform the steps of standard deviation except the final step (the square root).|
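As a quick check, Python's standard `statistics` module can compute the variance of the same five scores. The sketch below also shows the population variance for comparison, which is my addition and not part of the example: `statistics.variance` divides by n - 1, while `statistics.pvariance` divides by N.

```python
import statistics

# The five scores from the standard deviation example.
scores = [70, 110, 50, 20, 100]

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = float(statistics.variance(scores))

# Population variance: sum of squared deviations divided by N.
population_variance = float(statistics.pvariance(scores))

print(sample_variance)      # 1350.0
print(population_variance)  # 1080.0
```

Treating the same numbers as a whole population gives a smaller value, because the same sum of squares (5400) is divided by 5 instead of 4.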
Now, let's get the answer to the pizza delivery question (discussed at the start of the blog)!
At the start, we saw that two pizza restaurants advertise that they can deliver a pizza in an average of 20 minutes. But how can you find out which one is best?
Here, we calculate the measures of variability for each restaurant and find that they differ. The graph below shows the distribution of delivery times.
The restaurant with the more variable delivery times has the broader distribution curve.
Suppose a delivery of 30 minutes or longer is unacceptable. After all, we are hungry! In the graph, the shaded portion shows the proportion of delivery times that exceed 30 minutes.
Almost 16% of Restaurant 1's deliveries exceed 30 minutes. In contrast, only about 2% of deliveries from Restaurant 2, the restaurant with the lower variability, take that long. Both restaurants have 20 minutes as their average delivery time. But now, I know where to place my pizza order: Restaurant 2.
|Note: In this example, the central tendency alone does not give complete information. Therefore, you have to know the variability around the middle of the distribution to get a clear answer to your question.|
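The percentages above can be reproduced with a small sketch, under assumptions that are mine and not stated in the example: delivery times are taken to be normally distributed, and the standard deviations of 10 minutes (Restaurant 1) and 5 minutes (Restaurant 2) are values I picked because they are consistent with the 16% and 2% figures. `NormalDist` is part of Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import NormalDist

CUTOFF_MINUTES = 30  # deliveries at or beyond this are unacceptable

# Both restaurants average 20 minutes. The standard deviations are
# assumed values, chosen to match the percentages in the example.
restaurant_1 = NormalDist(mu=20, sigma=10)
restaurant_2 = NormalDist(mu=20, sigma=5)


def share_later_than(distribution, cutoff):
    """Return the proportion of deliveries that take longer than cutoff."""
    return 1 - distribution.cdf(cutoff)


late_1 = share_later_than(restaurant_1, CUTOFF_MINUTES)
late_2 = share_later_than(restaurant_2, CUTOFF_MINUTES)

print(f"Restaurant 1: {late_1:.1%}")  # Restaurant 1: 15.9%
print(f"Restaurant 2: {late_2:.1%}")  # Restaurant 2: 2.3%
```

The averages are identical, yet the share of late deliveries differs by a factor of about seven, purely because of the difference in spread.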
How can I get the best measures of variability?
Well, to choose the best measure of variability, you need to check the level of measurement and the distribution of your data. What do these two mean?
Level of measurement
For data measured at the ordinal level, the IQR (interquartile range) and the range (both discussed above) are the only measures of variability that can be used.
For the more complex interval and ratio levels, the variance and standard deviation (SD) can be used as well.
As for the distribution, remember that all of the measures can be used for a normal distribution. The variance and SD are usually preferred because they take the complete data set into account. But for the same reason, the variance and SD are easily influenced by outliers.
The IQR is the best measure for a skewed distribution. It concentrates on the spread in the middle of the data set. That is why it is least affected by extreme values.
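The difference in outlier sensitivity can be seen with a small experiment. The two data sets below are hypothetical and differ only in their largest value. The sketch uses `statistics.quantiles` from the standard library, which interpolates between values, so its quartiles can differ slightly from the simple position method used earlier; the helper name `iqr` is my own.

```python
import statistics


def iqr(values):
    """Return Q3 - Q1 using the quartiles from statistics.quantiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Hypothetical data sets: identical except for one extreme value.
typical = [10, 12, 13, 15, 16, 18, 19, 21]
with_outlier = [10, 12, 13, 15, 16, 18, 19, 200]

# The IQR looks only at the middle half, so the outlier does not move it.
print(iqr(typical), iqr(with_outlier))  # 6.5 6.5

# The SD uses every value, so the outlier inflates it considerably.
print(statistics.stdev(typical))
print(statistics.stdev(with_outlier))
```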
|A quick recap|
Variability is also termed as scatter, spread, or dispersion.
Interquartile range (IQR) is the range of a distribution’s middle half.
Variance is the mean of the squared distances from the average.
The range is the highest value minus the lowest value.
Standard deviation is the average distance from the mean.
As we have seen, variability shows up in almost every aspect of life, and there are four measures that a statistician needs to consider.
These are Range, IQR, SD, and Variance. We have detailed all the useful points that help you to understand the concept of variability.
We hope these details help you in the long run. If you have any doubts related to measures of variability, you are most welcome to ask.
Comment your doubts and get the best possible solutions.
“Stay motivated to learn new things daily with Statanalytica blogs.”
Frequently Asked Questions
Standard deviation is expressed in the original units of the data, which makes it easy to interpret. That is why it is fair to say that SD is the most commonly used measure of variability.
The variance and standard deviation are the most valuable measures of variation in psychology statistics.
Variability is the degree to which the data points diverge from the mean value. It is also the degree to which the data points differ from one another.