Mean, median and mode are different ways to summarise lists of numbers.
Philip B Stark, professor of statistics at the University of California, Berkeley, in the US, explains them as attempts at capturing “with a single number” what is typical of “an entire list of numbers”.
You’re likely already familiar with the mean. In everyday language it’s called the average.
To calculate the average of two numbers, add them up and divide the sum by two. To get the average of 100 values, add them up and divide the sum by 100, and so on.
sum of values ÷ number of values = mean
(1 + 5 + 13 + 34 + 47) ÷ 5 = 20
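The same sum can be checked in a few lines of Python. This is only a minimal sketch of the formula above; the variable names are chosen for illustration.

```python
# The five numbers from the worked example above.
values = [1, 5, 13, 34, 47]

# sum of values ÷ number of values = mean
mean = sum(values) / len(values)

print(mean)  # 20.0
```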
The mean is useful when you “primarily care about the total”, says Stark. He gives the example of the mean income of a family, which reveals what they can spend on each family member’s living costs.
But the mean can be “very sensitive to extreme values”, writes Alberto Cairo in The Truthful Art: Data, Charts, and Maps for Communication.
Say you’re analysing the starting salaries of a particular university’s graduates. An unusually high salary – in this example, former basketball player Michael Jordan’s – could distort the picture, says Cairo.
“Michael Jordan’s salary is an outlier, a value that is so far from the norm … that it twists our understanding of the data if we aren’t careful enough.”
This is where another measure – the median – comes in. The median doesn’t have the same limitation – it is “resistant”.
“Even if we added one outrageous value at the lower end of the series of values – say, 10 bucks a year – or increased Michael Jordan’s annual salary to 100 billion dollars, the median would remain untouched, while the mean would go bananas,” says Cairo.
So what is the median?
If the numbers are arranged from lowest to highest or highest to lowest, half would be above the median and half below it, according to a South African journal article on statistical terms.
In this example, the median is 13. It’s the middle number:
1 5 13 34 47
If you have a long list with an odd number of values, you can work out which position in the ordered list holds the median: add one to the number of values in your list and divide the result by two.
If we apply this formula to the above example, it would be:
(5 + 1) ÷ 2 = 3
This means the third value in the list is the median. In the example, it’s 13.
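As a rough sketch, the position rule for an odd-length list might look like this in Python. The names are illustrative, and the code assumes the list really does have an odd number of values. Python counts list items from zero, so the third value sits at index 2.

```python
values = [1, 5, 13, 34, 47]

# Sort first so that "the middle" means the middle by size.
ordered = sorted(values)

# (number of values + 1) ÷ 2 gives the position, counting from 1.
position = (len(ordered) + 1) // 2

# Subtract 1 because Python indexes from 0.
median = ordered[position - 1]

print(position)  # 3
print(median)    # 13
```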
What happens if there is an even number of values? Then the median is the average of the two middle numbers. In the example below, the median is nine:
1 5 13 34
(5 + 13) ÷ 2 = 9
Note that you must order the numbers first, or your result for the median will be incorrect.
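The odd and even cases, plus the ordering step, can be combined into one small function. This is a sketch for illustration rather than a definitive implementation; the function name `median_of` is made up here. Python's standard library also ships `statistics.median`, which does the same job.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median_of() needs at least one value")

    # Order the numbers first; skipping this step gives a wrong answer.
    ordered = sorted(values)
    count = len(ordered)
    middle = count // 2

    if count % 2 == 1:
        # Odd number of values: take the single middle value.
        return ordered[middle]

    # Even number of values: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Deliberately unordered inputs, using the numbers from the examples above.
print(median_of([47, 1, 34, 5, 13]))  # 13
print(median_of([34, 1, 13, 5]))      # 9.0
```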
One benefit of the median is that it’s not distorted by outliers, such as Michael Jordan’s salary above. A limitation of the median is that it doesn’t use all the information in a data set – it focuses on the middle value.
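To see this resistance in action, the sketch below takes the five example numbers and inflates the largest one. The figure of 47 million is invented purely for illustration. It uses `mean` and `median` from Python's standard `statistics` module.

```python
from statistics import mean, median

values = [1, 5, 13, 34, 47]

# Hypothetical outlier: the largest value blown up to 47 million.
with_outlier = [1, 5, 13, 34, 47_000_000]

print(mean(values), median(values))              # mean 20, median 13
print(mean(with_outlier), median(with_outlier))  # mean about 9.4 million, median still 13
```

The median stays at 13 because the middle position is unchanged, while the mean is pulled far away from every other value in the list.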
So when outliers aren’t an issue, the mean could be a better summary of the data “as it includes information from every observation [number in the list], rather than just the middle value”.
Not sure which of the two to use?
Data journalist Anthony DeBarros shares this tip: “A good test: calculate the average [mean] and the median for a group of values. If they’re close, then the group is probably normally distributed (the familiar bell curve), and the average is useful. If they’re far apart, then the values are not normally distributed and the median is the better representation.”
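A quick way to try this test is to compute both figures side by side. The sketch below does so for the five example numbers used earlier; the variable names are illustrative. The tip does not say how close is "close", so the code only reports the gap and leaves that judgement to the reader.

```python
from statistics import mean, median

values = [1, 5, 13, 34, 47]

average = mean(values)
middle = median(values)

# The size of the gap between the two summaries.
gap = abs(average - middle)

print(average, middle, gap)  # mean 20, median 13, gap 7
```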
(Note: If you’d like to further explore the difference between mean and median, here is an example of these measures applied to income inequality.)
Mode
The mode is the number that occurs most frequently in a series of numbers. If every number appears only once, there is no mode. There can also be more than one mode. For example:
The mode is five:
1 5 5 13 34 47
There is no mode:
1 5 13 34 47
There is more than one mode – five and 34:
1 5 5 13 34 34 47
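The three cases above can be reproduced with a short Python sketch. The function name `modes` is made up for illustration, and the rule that a list with no repeated number has no mode follows the definition given in this guide. Python's built-in `statistics.multimode` takes a different view and returns every value in that situation.

```python
from collections import Counter


def modes(values):
    """Return the most frequent values, or an empty list if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []

    highest = max(counts.values())
    if highest == 1:
        # Every number appears only once: no mode, by this guide's definition.
        return []

    # Keep every value that reaches the highest count.
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 5, 5, 13, 34, 47]))      # [5]
print(modes([1, 5, 13, 34, 47]))         # []
print(modes([1, 5, 5, 13, 34, 34, 47]))  # [5, 34]
```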
“Mode is simple to locate and is preferred for finding the most popular item e.g. most popular drink or the most common size of shoes,” according to the International Encyclopedia of Statistical Science.