
Measure Variance

Learning Objectives

After completing this unit, you’ll be able to:

  • Define variance and standard deviation.
  • Calculate mean, variance, and standard deviation.

When you’re looking at the distribution of your data, do you have data that is spread out? What can the spread tell you about the data, and what conclusions can you draw? In this module, you gain familiarity with the concepts of variation and making informed, or wise, comparisons, which can help you to explore, understand, and communicate with data. 

Variance and Standard Deviation

The Data Distributions module introduces the shape (symmetrical or skewed) and the center (mean or median) of the data. 

Now we will look at the spread of the data. Variance measures how far the data points fall from the mean, expressed as the average of the squared differences. Standard deviation is the square root of the variance, so it describes the same spread in the original units of the data. Let’s consider an example.

Two groups of students took quizzes worth 10 points each. Both groups had a mean quiz score of 7, or 70%. However, Group A’s quiz scores range from 5 to 9 (50% to 90%), while Group B’s quiz scores range from 4 to 10 (40% to 100%). The scores for Group B are more spread out than the scores for Group A.

We want to better understand the spread of the data. To do this, we measure the variance and standard deviation using the following steps.

  • Verify the mean. When looking at the data, we see that each group has 20 quiz takers. If we calculate the sum of all the scores for each group, we get a total of 140 for both Group A and Group B.
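
| Group A Quiz Scores | Group B Quiz Scores |
| --- | --- |
| 9 | 10 |
| 9 | 10 |
| 9 | 10 |
| 8 | 9 |
| 8 | 9 |
| 8 | 9 |
| 8 | 8 |
| 7 | 8 |
| 7 | 7 |
| 7 | 7 |
| 7 | 7 |
| 7 | 6 |
| 6 | 6 |
| 6 | 6 |
| 6 | 5 |
| 6 | 5 |
| 6 | 5 |
| 6 | 5 |
| 5 | 4 |
| 5 | 4 |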

To calculate the mean, we divide the total for each group by the number of quiz takers in the group. For each group, the equation is 140/20, and the mean score for each group is 7 (or 70%).
Group A: 
9 + 9 + 9 + 8 + 8 + 8 + 8 + 7 + 7 + 7 + 7 + 7 + 6 + 6 + 6 + 6 + 6 + 6 + 5 + 5 = 140
140/20 = 7

Group B:
10 + 10 + 10 + 9 + 9 + 9 + 8 + 8 + 7 + 7 + 7 + 6 + 6 + 6 + 5 + 5 + 5 + 5 + 4 + 4 = 140
140/20 = 7
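
The mean check above can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names `group_a` and `group_b` and the helper `mean` are illustrative choices, not part of the original unit.

```python
# Quiz scores copied from the Group A and Group B table above.
group_a = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]
group_b = [10, 10, 10, 9, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 5, 5, 5, 5, 4, 4]


def mean(scores):
    """Return the arithmetic mean: the total divided by the number of scores."""
    return sum(scores) / len(scores)


for name, scores in (("Group A", group_a), ("Group B", group_b)):
    # Each line shows the total, the number of quiz takers, and the mean.
    print(name, sum(scores), len(scores), mean(scores))
```

Both groups should report a total of 140 across 20 quiz takers, which gives a mean of 7.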

  • Begin calculating the variance by finding the differences.

Now that we’ve calculated the mean, we can begin to calculate the variance. Variance measures how spread out the data is. A variance of zero indicates that all of the data values are identical. A high variance indicates that the data points are spread far from the mean and from one another.

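| Group A Quiz Scores | Difference from mean (7, or 70%) | Group B Quiz Scores | Difference from mean (7, or 70%) |
| --- | --- | --- | --- |
| 9 | 2 | 10 | 3 |
| 9 | 2 | 10 | 3 |
| 9 | 2 | 10 | 3 |
| 8 | 1 | 9 | 2 |
| 8 | 1 | 9 | 2 |
| 8 | 1 | 9 | 2 |
| 8 | 1 | 8 | 1 |
| 7 | 0 | 8 | 1 |
| 7 | 0 | 7 | 0 |
| 7 | 0 | 7 | 0 |
| 7 | 0 | 7 | 0 |
| 7 | 0 | 6 | -1 |
| 6 | -1 | 6 | -1 |
| 6 | -1 | 6 | -1 |
| 6 | -1 | 5 | -2 |
| 6 | -1 | 5 | -2 |
| 6 | -1 | 5 | -2 |
| 6 | -1 | 5 | -2 |
| 5 | -2 | 4 | -3 |
| 5 | -2 | 4 | -3 |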

To calculate the variance, add the squared distances of each data point from the mean and then divide by the number of data points.

To begin, let’s calculate the difference from the mean score of 7 for each quiz taker. For example, the difference between 9 and 7 is 2 (since 9 - 7 = 2) and the difference between 6 and 7 is -1 (since 6 - 7 = -1).
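
The differences described above can be sketched in Python as follows. The function name `differences_from_mean` is a hypothetical helper chosen for this illustration.

```python
# Quiz scores copied from the tables above.
group_a = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]
group_b = [10, 10, 10, 9, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 5, 5, 5, 5, 4, 4]


def differences_from_mean(scores):
    """Return each score minus the mean of all the scores."""
    mean = sum(scores) / len(scores)
    return [score - mean for score in scores]


diffs_a = differences_from_mean(group_a)
diffs_b = differences_from_mean(group_b)

print(diffs_a)
print(diffs_b)

# Positive and negative differences cancel out when added together,
# so their plain sum says nothing about the spread.
print(sum(diffs_a), sum(diffs_b))
```

Because the plain sum of the differences is zero for both groups, it cannot distinguish the wider spread of Group B from the narrower spread of Group A.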

  • Continue calculating the variance by squaring the differences.

We’ve calculated the difference from the mean for each quiz taker. Now, let’s square each difference. For example, the difference between 9 and 7 is 2 (9 - 7 = 2), and the square of 2 is 4 (since 2 * 2 = 4). The difference between 6 and 7 is -1 (since 6 - 7 = -1), and the square of -1 is 1 (since -1 * -1 = 1).

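| Group A Quiz Scores | Difference from mean (7, or 70%) | Squared value of difference from mean | Group B Quiz Scores | Difference from mean (7, or 70%) | Squared value of difference from mean |
| --- | --- | --- | --- | --- | --- |
| 9 | 2 | 4 | 10 | 3 | 9 |
| 9 | 2 | 4 | 10 | 3 | 9 |
| 9 | 2 | 4 | 10 | 3 | 9 |
| 8 | 1 | 1 | 9 | 2 | 4 |
| 8 | 1 | 1 | 9 | 2 | 4 |
| 8 | 1 | 1 | 9 | 2 | 4 |
| 8 | 1 | 1 | 8 | 1 | 1 |
| 7 | 0 | 0 | 8 | 1 | 1 |
| 7 | 0 | 0 | 7 | 0 | 0 |
| 7 | 0 | 0 | 7 | 0 | 0 |
| 7 | 0 | 0 | 7 | 0 | 0 |
| 7 | 0 | 0 | 6 | -1 | 1 |
| 6 | -1 | 1 | 6 | -1 | 1 |
| 6 | -1 | 1 | 6 | -1 | 1 |
| 6 | -1 | 1 | 5 | -2 | 4 |
| 6 | -1 | 1 | 5 | -2 | 4 |
| 6 | -1 | 1 | 5 | -2 | 4 |
| 6 | -1 | 1 | 5 | -2 | 4 |
| 5 | -2 | 4 | 4 | -3 | 9 |
| 5 | -2 | 4 | 4 | -3 | 9 |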

  • Continue calculating the variance by summing the squared differences.

We’ve calculated the difference from the mean for each quiz taker, and we’ve squared each difference. Now, we sum the squared differences for each group:

Group A: 

4 + 4 + 4 + 1 + 1 + 1 + 1 + 0 + 0 + 0 + 0 + 0 + 1 + 1 + 1 + 1 + 1 + 1 + 4 + 4 = 30

Group B:

9 + 9 + 9 + 4 + 4 + 4 + 1 + 1 + 0 + 0 + 0 + 1 + 1 + 1 + 4 + 4 + 4 + 4 + 9 + 9 = 78
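
Squaring and summing can be combined into one short Python sketch. The helper name `sum_of_squared_differences` is an assumption made for this example.

```python
# Quiz scores copied from the tables above.
group_a = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]
group_b = [10, 10, 10, 9, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 5, 5, 5, 5, 4, 4]


def sum_of_squared_differences(scores):
    """Square each score's difference from the mean, then add the squares."""
    mean = sum(scores) / len(scores)
    return sum((score - mean) ** 2 for score in scores)


total_a = sum_of_squared_differences(group_a)
total_b = sum_of_squared_differences(group_b)

print(total_a)  # matches the Group A total of 30 above
print(total_b)  # matches the Group B total of 78 above
```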

  • Finish calculating the variance by averaging the squared differences.

To find the variance, we now divide the summed squares for each group by the total number of data points (quiz takers) in the group, or 20. 

The variance for Group A is 1.5, and the variance for Group B is 3.9.

Group A: 

4 + 4 + 4 + 1 + 1 + 1 + 1 + 0 + 0 + 0 + 0 + 0 + 1 + 1 + 1 + 1 + 1 + 1 + 4 + 4 = 30

30/20 = 1.5

Group B:

9 + 9 + 9 + 4 + 4 + 4 + 1 + 1 + 0 + 0 + 0 + 1 + 1 + 1 + 4 + 4 + 4 + 4 + 9 + 9 = 78

78/20 = 3.9
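
The whole calculation so far can be written as a single formula. The symbols are a notational assumption, not taken from the unit: N is the number of data points, x_i is an individual score, μ is the mean, and σ² is the population variance.

```latex
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

% Group A: N = 20, \mu = 7, sum of squared differences = 30
\sigma_A^2 = \frac{30}{20} = 1.5

% Group B: N = 20, \mu = 7, sum of squared differences = 78
\sigma_B^2 = \frac{78}{20} = 3.9
```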

  • Calculate the standard deviation.

The standard deviation measures the dispersion of a data set relative to its mean, and is calculated as the square root of the variance. If the data points are farther from the mean, there is a higher deviation within the data set. In other words, the more spread out the data, the higher the standard deviation.

We’ve calculated the variance for each group. To find the standard deviation for each group, we calculate the square root of the variance. 

The standard deviation for Group A is 1.22, and the standard deviation for Group B is 1.97.

Group A: 

Variance = 1.5

Square root of 1.5 = 1.22

Group B:

Variance = 3.9

Square root of 3.9 = 1.97
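
As a quick check of the square roots above, here is a minimal Python sketch. It uses `math.sqrt` on the variances already calculated, then cross-checks against `statistics.pstdev` from the standard library, which computes the population standard deviation directly from the scores.

```python
import math
import statistics

# Quiz scores copied from the tables above.
group_a = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]
group_b = [10, 10, 10, 9, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 5, 5, 5, 5, 4, 4]

# Variances calculated in the previous step.
variance_a = 1.5
variance_b = 3.9

# Standard deviation is the square root of the variance.
sd_a = math.sqrt(variance_a)
sd_b = math.sqrt(variance_b)
print(round(sd_a, 2), round(sd_b, 2))  # 1.22 1.97

# Cross-check: compute the population standard deviation from the raw scores.
check_a = statistics.pstdev(group_a)
check_b = statistics.pstdev(group_b)
print(round(check_a, 2), round(check_b, 2))
```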

  • Revisit the data.

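We can now show which quiz takers’ scores are within one standard deviation of the mean for each group. The difference from the mean can be positive or negative, so a score is within one standard deviation when the size of its difference, ignoring the sign, is no larger than the standard deviation.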

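| Group A Quiz Scores | Difference from mean (7, or 70%) | Squared value of difference from mean | Within 1 standard deviation from mean (1.22)? | Group B Quiz Scores | Difference from mean (7, or 70%) | Squared value of difference from mean | Within 1 standard deviation from mean (1.97)? |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 9 | 2 | 4 | No | 10 | 3 | 9 | No |
| 9 | 2 | 4 | No | 10 | 3 | 9 | No |
| 9 | 2 | 4 | No | 10 | 3 | 9 | No |
| 8 | 1 | 1 | Yes | 9 | 2 | 4 | No |
| 8 | 1 | 1 | Yes | 9 | 2 | 4 | No |
| 8 | 1 | 1 | Yes | 9 | 2 | 4 | No |
| 8 | 1 | 1 | Yes | 8 | 1 | 1 | Yes |
| 7 | 0 | 0 | Yes | 8 | 1 | 1 | Yes |
| 7 | 0 | 0 | Yes | 7 | 0 | 0 | Yes |
| 7 | 0 | 0 | Yes | 7 | 0 | 0 | Yes |
| 7 | 0 | 0 | Yes | 7 | 0 | 0 | Yes |
| 7 | 0 | 0 | Yes | 6 | -1 | 1 | Yes |
| 6 | -1 | 1 | Yes | 6 | -1 | 1 | Yes |
| 6 | -1 | 1 | Yes | 6 | -1 | 1 | Yes |
| 6 | -1 | 1 | Yes | 5 | -2 | 4 | No |
| 6 | -1 | 1 | Yes | 5 | -2 | 4 | No |
| 6 | -1 | 1 | Yes | 5 | -2 | 4 | No |
| 6 | -1 | 1 | Yes | 5 | -2 | 4 | No |
| 5 | -2 | 4 | No | 4 | -3 | 9 | No |
| 5 | -2 | 4 | No | 4 | -3 | 9 | No |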
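
The Yes/No column in the table above can be sketched in Python as a comparison between the size of each difference and the standard deviation. The helper name `within_one_sd` is hypothetical.

```python
import math

# Quiz scores copied from the tables above.
group_a = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]
group_b = [10, 10, 10, 9, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 5, 5, 5, 5, 4, 4]


def within_one_sd(scores):
    """Return True for each score within one standard deviation of the mean."""
    mean = sum(scores) / len(scores)
    variance = sum((score - mean) ** 2 for score in scores) / len(scores)
    sd = math.sqrt(variance)
    # abs() ignores the sign, so scores above and below the mean are treated alike.
    return [abs(score - mean) <= sd for score in scores]


flags_a = within_one_sd(group_a)
flags_b = within_one_sd(group_b)

# True counts as 1 when summed, so this counts the "Yes" rows.
print(sum(flags_a), "of", len(flags_a))  # 15 of 20
print(sum(flags_b), "of", len(flags_b))  # 8 of 20
```

Counting the Yes rows this way gives 15 of 20 scores for Group A and 8 of 20 for Group B, matching the table.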

You’ve now seen the process for calculating variance and standard deviation. Later in this unit, you'll have the chance to perform these calculations in a simple scenario.

Sample Variance

What should you do if you don’t have data for the whole population?

There is a difference in the calculation of variance for a population and for a sample, or subset, of a population. For both, you calculate the mean, then the differences from the mean, square all the differences, and then sum the squared differences.

As in the previous example, when calculating population variance, divide the sum of squared deviations from the mean by the number of items in the population. In a full population of 20, for example, we divide by 20. 

Now here’s the difference. When calculating sample variance, divide the sum of squared deviations from the mean by the number of items in the sample minus one. For example, if you had 20 items in a sample (or subset) of the population, divide by 19. The purpose of this adjustment is to get a less biased estimate of the population’s variance. In other words, dividing by the sample size minus one (n - 1) compensates for working with a sample rather than the whole population. The lowercase n represents the number of observations in the sample.
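
To see the effect of the divisor, the sketch below uses Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n - 1. Treating Group A's 20 quiz scores as a sample is an assumption made purely for illustration. The earlier example treats them as a full population.

```python
import statistics

# Group A quiz scores from the earlier example, reused here as a
# hypothetical sample of a larger population.
scores = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]

# Population variance: sum of squared differences divided by n (30 / 20).
population_variance = statistics.pvariance(scores)

# Sample variance: the same sum divided by n - 1 (30 / 19).
sample_variance = statistics.variance(scores)

print(population_variance)        # 1.5
print(round(sample_variance, 2))  # about 1.58
```

The sample variance comes out slightly larger because the same sum of squares is divided by a smaller number.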

Example: Calculate the Variance and Standard Deviation

Now, follow along to determine the variance and the standard deviation using an example with fewer numbers.

Imagine that you have five cats in your household: Cinnamon, The Amazing Fluffy, Lilypad, Danielle, and Steve.

[Image: Five cats, photographed from the back, looking out a window]

To keep things straightforward, let’s consider the cats in your home a complete population rather than a sample. You weigh each of the cats, and record the results as represented in the following table.

| Cat’s name | Weight (in pounds) |
| --- | --- |
| Cinnamon | 7 |
| Danielle | 8 |
| Lilypad | 9 |
| Steve | 12 |
| The Amazing Fluffy | 14 |

First, calculate the mean (or average) weight for the five cats.

  1. Add all the weights together:  
    7 + 8 + 9 + 12 + 14 = 50
  2. Then divide that total by the number of cats in the data:
    50/5 = 10  
    10 pounds is the mean weight for this group of cats.
    Now, begin to calculate the variance.
  3. First, calculate each cat’s difference from the mean weight:

    | Cat’s name | Weight (in pounds) | Difference from mean (10 pounds) |
    | --- | --- | --- |
    | Cinnamon | 7 | 7 - 10 = (-3) |
    | Danielle | 8 | 8 - 10 = (-2) |
    | Lilypad | 9 | 9 - 10 = (-1) |
    | Steve | 12 | 12 - 10 = 2 |
    | The Amazing Fluffy | 14 | 14 - 10 = 4 |


  4. Now, square each difference from the mean.

    | Cat’s name | Weight (in pounds) | Difference from mean (10 pounds) | Squared value of difference from mean |
    | --- | --- | --- | --- |
    | Cinnamon | 7 | (-3) | (-3) * (-3) = 9 |
    | Danielle | 8 | (-2) | (-2) * (-2) = 4 |
    | Lilypad | 9 | (-1) | (-1) * (-1) = 1 |
    | Steve | 12 | 2 | 2 * 2 = 4 |
    | The Amazing Fluffy | 14 | 4 | 4 * 4 = 16 |


  5. Next, add all the squared values of the differences from the mean together:
    9 + 4 + 1 + 4 + 16 = 34

  6. Then, divide the result by the number of data points (or cats):
    34/5 = 6.8. So, 6.8 is the variance for the cats.

  7. Now that you have calculated the variance, calculate the standard deviation by finding the square root of the variance. (You can use a calculator to do this.)
    The square root of 6.8 is approximately 2.6. So, 2.6 pounds is the standard deviation.
    You can now see which cats’ weights are within one standard deviation (2.6 pounds) of the mean (10 pounds):

| Cat’s name | Weight (in pounds) | Difference from mean (10 pounds) | Within one standard deviation (2.6 pounds)? |
| --- | --- | --- | --- |
| Cinnamon | 7 | (-3) | No |
| Danielle | 8 | (-2) | Yes |
| Lilypad | 9 | (-1) | Yes |
| Steve | 12 | 2 | Yes |
| The Amazing Fluffy | 14 | 4 | No |
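
The full cat example can be retraced end to end in Python. This is a minimal sketch that follows the numbered steps above and treats the five cats as a complete population. The dictionary name `weights` and the other variable names are illustrative only.

```python
import math

# Cat weights in pounds, copied from the table above.
weights = {
    "Cinnamon": 7,
    "Danielle": 8,
    "Lilypad": 9,
    "Steve": 12,
    "The Amazing Fluffy": 14,
}

# Steps 1 and 2: add the weights and divide by the number of cats.
mean = sum(weights.values()) / len(weights)

# Steps 3 and 4: find each difference from the mean, then square it.
squared = {name: (weight - mean) ** 2 for name, weight in weights.items()}

# Steps 5 and 6: add the squared differences and divide by the number of cats.
variance = sum(squared.values()) / len(weights)

# Step 7: the standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print(mean, variance, round(sd, 1))

# A cat is within one standard deviation when the size of its
# difference from the mean is no larger than the standard deviation.
within = [name for name, weight in weights.items() if abs(weight - mean) <= sd]
print(within)
```

The script should list Danielle, Lilypad, and Steve as within one standard deviation of the mean, which agrees with the table.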
