
When you want to see an average for a particular metric, you may be tempted to sum all the elements and divide by their count. Technically, that gives the correct mean value. But in some cases, it can be seriously misleading.

## What's wrong with mean values?

Here is an illustration from a great book, "Naked Statistics" by Charles Wheelan:

"Imagine that ten guys are sitting on bar stools in a middle-class drinking establishment in Seattle; each of these guys earns $35,000 a year, which makes the mean annual income for the group $35,000. Bill Gates walks into this bar... Let's assume that Bill Gates has an annual income of $1 billion. When Bill sits down on the eleventh bar stool, the mean annual income for the bar patrons rises to about $91 million."

This illustrates the effect of outliers: one element (or a few) has a substantially higher or lower value than the majority of the others. Here, a single outlier raises the mean from $35,000 to about $91 million, a figure that represents none of the eleven patrons.
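The arithmetic behind the quote is easy to check. The snippet below is a minimal sketch that uses only the figures from the example; the variable names are made up for illustration.

```python
from statistics import mean

# Ten patrons, each earning $35,000 a year (figures from the quote).
patrons = [35_000] * 10
mean_before = mean(patrons)

# Bill Gates joins with the assumed annual income of $1 billion.
with_gates = patrons + [1_000_000_000]
mean_after = mean(with_gates)

print(f"Mean before: ${mean_before:,.0f}")  # $35,000
print(f"Mean after:  ${mean_after:,.0f}")   # $90,940,909, i.e. about $91 million
```

Ten of the eleven values did not change, yet the mean grew by a factor of roughly 2,600.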

## Use median instead

The median is another statistic, and it is often more useful than the mean when the data contains outliers. It is also easy to calculate.

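Sort all data elements by value (ascending or descending order does not matter). Then find the element located exactly in the middle of the list. Its value is the median. If the list has an even number of elements, there is no single middle element, so take the mean of the two middle values. In the bar example, the median income is still $35,000 after Bill Gates sits down.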
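The procedure can be sketched in a few lines of Python. The helper name `median_of` is hypothetical, and averaging the two middle values for an even-sized list is the usual convention rather than the only possible one. In practice, the standard library's `statistics.median` follows the same rule, so you rarely need to write this yourself.

```python
def median_of(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)  # sort order does not affect the result
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single element sits exactly in the middle.
        return ordered[mid]
    # Even count: average the two elements around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


# The bar example: ten patrons at $35,000 plus one at $1 billion.
bar = [35_000] * 10 + [1_000_000_000]
print(median_of(bar))           # 35000
print(median_of([1, 2, 3, 4]))  # 2.5
```

Unlike the mean, the result for the bar stays at $35,000: the outlier occupies one position at the end of the sorted list and does not move the middle.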

You can read the median as "for half of our clients, this indicator is at or below this value," which is often a more practical statement than the mean.