Guide: Mean, median and mode

Confused by the difference between the mean, the median and the mode? This guide gives the answers.

Mean, median and mode are different ways to summarise lists of numbers.

Philip B Stark, professor of statistics at the University of California, Berkeley in the US, explains them as attempts at capturing “with a single number” what is typical of “an entire list of numbers”. 

If, as Google autocomplete suggests, you’re one of the people wondering how to use mean, median and mode “in real life”, this guide is for you.

Mean

You’re likely already familiar with the mean. In everyday language it’s called the average. 

The mean is “the sum of all the values divided by the total number of values”, says a Statistics South Africa guide on data handling and probability.

To calculate the average of two numbers, add them up and divide the sum by two. To get the average of 100 values, add them up and divide the sum by 100, and so on.

For example: 

sum of values ÷ number of values = mean

(1 + 5 + 13 + 34 + 47) ÷ 5 = 20
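
The same calculation can be sketched in a few lines of Python. The function name mean_of is only an illustrative choice and does not come from any of the sources quoted here.

```python
def mean_of(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


numbers = [1, 5, 13, 34, 47]
print(mean_of(numbers))  # 20.0
```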

The mean is useful when you “primarily care about the total”, says Stark. He gives the example of the mean income of a family, which reveals what they can spend on each family member’s living costs.

But the mean can be “very sensitive to extreme values”, writes Alberto Cairo in The Truthful Art: Data, Charts, and Maps for Communication.

Say you’re analysing the starting salaries of a particular university’s graduates. An unusually high salary – in this example, former basketball player Michael Jordan’s – could distort the picture, says Cairo.

“Michael Jordan’s salary is an outlier, a value that is so far from the norm … that it twists our understanding of the data if we aren’t careful enough.”

Median

This is where another measure – the median – comes in. The median doesn’t have the same limitation – it is “resistant”. 

“Even if we added one outrageous value at the lower end of the series of values – say, 10 bucks a year – or increased Michael Jordan’s annual salary to 100 billion dollars, the median would remain untouched, while the mean would go bananas,” says Cairo.

So what is the median?

It is “the middle value when all values are placed in ascending or descending order”.

If the numbers are arranged from lowest to highest or highest to lowest, half would be above the median and half below it, according to a South African journal article on statistical terms.

In this example, the median is 13. It’s the middle number:

1 5 13 34 47

If you have a big list with an odd number of values, you can work out the median’s position in the ordered list (whether it is the third value, the 51st value, and so on) like this: add one to the number of values in your list and divide the result by two.

If we apply this formula to the above example, it would be:

(5 + 1) ÷ 2 = 3

This means the third value in the list is the median. In the example, it’s 13.
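
As a rough sketch, the position shortcut could look like this in Python. The helper name median_position is hypothetical, and it assumes the list is already in order and has an odd number of values.

```python
def median_position(count):
    """1-based position of the median in an ordered list with an odd count."""
    if count % 2 == 0:
        raise ValueError("this shortcut only applies to an odd number of values")
    return (count + 1) // 2


ordered = [1, 5, 13, 34, 47]
position = median_position(len(ordered))
print(position)               # 3
print(ordered[position - 1])  # 13 (Python lists count from zero)
```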

What happens if there is an even number of values? Then the median is the average of the two middle numbers. In the example below, the median is nine:

1 5 13 34

(5 + 13) ÷ 2 = 9

Note that you must order the numbers first, or your result for the median will be incorrect.
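
Putting these steps together, one possible sketch in Python sorts the values first and then handles the odd and even cases separately. The function name median_of is an illustrative assumption.

```python
def median_of(values):
    """Return the middle value, or the mean of the two middle values."""
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)  # ordering first is essential
    count = len(ordered)
    middle = count // 2
    if count % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[middle]
    # Even count: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([47, 1, 34, 5, 13]))  # 13
print(median_of([34, 1, 13, 5]))      # 9.0
```

Both lists are deliberately given out of order to show that the function does the sorting itself.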

One benefit of the median is that it’s not distorted by outliers, such as Michael Jordan’s salary above. A limitation of the median is that it doesn’t use all the information in a data set – it focuses on the middle value.
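
To see the effect of an outlier, here is a small Python sketch using the standard library's statistics module. The salary figures are invented purely for illustration and are not real data.

```python
from statistics import mean, median

# Hypothetical starting salaries, made up for this illustration.
salaries = [30_000, 32_000, 35_000, 38_000, 40_000]

# The same list with one extreme value added.
with_outlier = salaries + [10_000_000]

print(mean(salaries), median(salaries))          # both are 35000
print(mean(with_outlier), median(with_outlier))  # mean jumps to about 1.7 million, median is 36500.0
```

One extreme value pulls the mean far away from every ordinary salary in the list, while the median barely moves.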

So when outliers aren’t an issue, the mean could be a better summary of the data “as it includes information from every observation [number in the list], rather than just the middle value”.

Not sure which of the two to use? 

Data journalist Anthony DeBarros shares this tip: “A good test: calculate the average [mean] and the median for a group of values. If they’re close, then the group is probably normally distributed (the familiar bell curve), and the average is useful. If they’re far apart, then the values are not normally distributed and the median is the better representation.”
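
DeBarros’s test could be sketched in Python as below. He does not say how close counts as “close”, so the helper mean_median_gap and the 10% threshold are assumptions made for this example only, and the two lists are invented.

```python
from statistics import mean, median


def mean_median_gap(values):
    """Gap between mean and median, as a share of the median (hypothetical helper)."""
    middle = median(values)
    if middle == 0:
        raise ValueError("a relative gap is undefined when the median is zero")
    return abs(mean(values) - middle) / abs(middle)


THRESHOLD = 0.1  # assumption: treat a gap under 10% as "close"

evenly_spread = [8, 9, 10, 11, 12]
one_extreme = [8, 9, 10, 11, 500]

for values in (evenly_spread, one_extreme):
    gap = mean_median_gap(values)
    choice = "mean is useful" if gap < THRESHOLD else "median is the safer summary"
    print(values, round(gap, 2), choice)
```

This is only a rule of thumb for a first look at the data, in the spirit of the tip above.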

(Note: If you’d like to further explore the difference between mean and median, here is an example of these measures applied to income inequality.)

Mode

The mode is the number that occurs most frequently in a series of numbers. If every number appears only once, there is no mode. There can also be more than one mode. For example:

The mode is five:

1 5 5 13 34 47

There is no mode:

1 5 13 34 47

There is more than one mode – five and 34:

1 5 5 13 34 34 47
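
The three cases above can be reproduced with a short Python sketch. The function name modes_of is an illustrative choice; it follows this guide's convention that a list in which every value appears once has no mode.

```python
from collections import Counter


def modes_of(values):
    """Return every value tied for most frequent, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value appears only once: no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes_of([1, 5, 5, 13, 34, 47]))      # [5]
print(modes_of([1, 5, 13, 34, 47]))         # []
print(modes_of([1, 5, 5, 13, 34, 34, 47]))  # [5, 34]
```

Python's built-in statistics.multimode takes a different view of the second case: when no value repeats, it returns every value in the list instead of an empty result.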

“Mode is simple to locate and is preferred for finding the most popular item e.g. most popular drink or the most common size of shoes,” according to the International Encyclopedia of Statistical Science.

An advantage of the mode is that it isn’t limited to numbers. It can also apply to categorical data, such as the colours of cars in a parking lot.
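
As a small illustration in Python, the mode of a list of colours can be found by counting each label. The list of car colours is invented for this example.

```python
from collections import Counter

# Hypothetical colours of cars seen in a parking lot.
colours = ["white", "silver", "white", "red", "black", "white", "silver"]

counts = Counter(colours)
colour, count = counts.most_common(1)[0]
print(colour, count)  # white 3
```

No arithmetic is done on the values themselves, which is why the mode works for labels where a mean or median would make no sense.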

 

Additional reading

GUIDE: Tips to avoid three common statistical errors

GUIDE: How to get started with data journalism


 

© Copyright Africa Check 2020. Read our republishing guidelines. You may reproduce this piece or content from it for the purpose of reporting and/or discussing news and current events. This is subject to: crediting Africa Check in the byline, keeping all hyperlinks to the sources used and adding this sentence at the end of your publication: “This report was written by Africa Check, a non-partisan fact-checking organisation. View the original piece on their website”, with a link back to this page.