Does the axis have to start at zero? (Part 2 – bar charts)

I recently wrote a blog post about axes on line charts starting at non-zero values. However, bar charts are a bit of a different beast, because people tend to compare the lengths of the bars rather than read the scale on the axis.

I’ve been keeping my eye out for a real-life UK example to discuss in this post for a couple of weeks now and despite much searching I’ve been disappointed that I couldn’t find one. I’m hoping that this is because people’s knowledge of data vis best practice is improving but I’ve got to admit I’m not wholly convinced as there are plenty of other dubious-looking charts about.

Instead I’ll fall back on a Fox News example, as they’ve been repeatedly accused of misleading viewers with charts. This extremely skewed bar chart, for instance, appears to show that 6 million is around a third of 7 million. You intuitively compare the lengths of the bars, assuming the base is zero. It’s particularly bad because the y-axis isn’t even labelled, so only the most savvy viewers will notice that the data is being misrepresented.

obamacareenrollment-fncchart.jpg

Misusing charts in this way isn’t ideal if you want people to trust you and your data.

So what are your options?

Earlier in the year I was working on an article looking at how personal well-being changes by age. Happiness was assessed by asking people “Overall, how happy did you feel yesterday?” They then responded on a scale of 0 to 10, where 0 was “not at all” and 10 was “completely”.

Responses to how happy people were varied relatively little by age. The data was all between 7.1 and 7.8, so plotting it on a bar chart with an axis showing the full range of answers gives the following…

Average happiness rating by age group, 2012 to 2015, UK

Screen Shot 2016-06-13 at 16.38.00.png

Wow, that’s a lot of orange. Any patterns in the data just seem to be dominated by the fact that there are lots of bars. So what can we do to focus more on the pattern in the data?

Is it a good idea to scale the axis to look at just the area of interest – maybe between 7 and 8?

Average happiness rating by age group, 2012 to 2015, UK

Screen Shot 2016-06-13 at 16.38.09.png

Here’s the re-scaled bar chart. However, the bars now misrepresent the data, and you really have to rely on the reader looking at the axis labels to avoid coming to an incorrect conclusion. To demonstrate, let’s take a look at the 16 to 19 and 20 to 24 age groups. It’s natural for people to compare lengths, so looking at these two bars you might think that 16 to 19 year olds were almost twice as happy as 20 to 24 year olds, when actually they’re pretty similar – only a difference of 0.27.

Screen Shot 2016-06-13 at 16.37.22.png
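To put a rough number on that distortion, here is a minimal Python sketch comparing the ratio of the two values with the ratio of the bar lengths once the axis starts at 7. The two ratings are hypothetical, chosen only so that they sit 0.27 apart within the 7.1 to 7.8 range; they are not the published figures, and the function name `apparent_ratio` is made up for illustration.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths when the axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("both values must be above the baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical ratings, 0.27 apart; not the published figures.
younger, older = 7.60, 7.33

true_ratio = apparent_ratio(younger, older)        # axis starts at 0
drawn_ratio = apparent_ratio(younger, older, 7.0)  # axis starts at 7

print(f"ratio of values:      {true_ratio:.2f}")
print(f"ratio of bar lengths: {drawn_ratio:.2f}")
```

With these assumed numbers the values differ by about 4%, but the bars differ in length by about 80%, which is why one group looks almost twice as happy as the other.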

So what’s the solution? One way of managing this would be to move away from using bars entirely, for example, using the position of a dot to mark each data point. The length aspect no longer exists and the reader is forced to look at the position of the data point relative to the x-axis in order to make a comparison between categories.

Average happiness rating by age group, 2012 to 2015, UK

Screen Shot 2016-06-10 at 14.20.24.png

Quite an improvement I think. Something else that can help ensure that people realise the axis does not start at zero is animation. The GIF below demonstrates an example of how this can work, but I prefer the dot plot option – much easier to do, no gratuitous animations and if someone prints the chart or takes a screenshot you can predict what they can see.

Average happiness rating by age group, 2012 to 2015, UK

happiness
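To make the dot plot idea a little more concrete, here is a rough plain-text sketch in Python. It places one marker per age group by position along an axis running from 7 to 8, so there is no bar length to misread. The ratings are hypothetical and only for illustration, the helper `dot_row` is a made-up name, and in practice you would hand this job to a proper charting tool.

```python
def dot_row(value, axis_min, axis_max, width=41):
    """Return a text row with one marker placed by position along the axis."""
    if not axis_min <= value <= axis_max:
        raise ValueError("value lies outside the axis range")
    # Map the value to a column index between 0 and width - 1.
    col = round((value - axis_min) / (axis_max - axis_min) * (width - 1))
    return "." * col + "o" + "." * (width - 1 - col)


# Hypothetical ratings for illustration only; not the published figures.
ratings = {"16 to 19": 7.60, "20 to 24": 7.33, "25 to 29": 7.40}

for group, rating in ratings.items():
    print(f"{group:>8}  {dot_row(rating, 7.0, 8.0)}")

# Axis labels: the scale visibly starts at 7.0, not zero.
print(f"{'':>8}  7.0{' ' * 35}8.0")
```

Because only the position of the marker carries the value, starting the axis at 7 changes the spacing between markers but does not suggest that one group is a multiple of another.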

This discussion on axes starting at a non-zero value is one of those timeless data vis discussions so there’s lots to read on the subject if it interests you.

Here are a few to get you started:

It’s OK not to start your y-axis at zero

Baseline for amount scale

When It’s OK to NOT Start Your Axis at Zero 

But the moral of this story is – bar charts should always include zero.

And another thing…

On the similar subject of bars misrepresenting data, here’s one of my favourite examples of dodgy scaling, which I spotted in Newport when I was getting the train back from one of the Intro to Data Vis courses.

Screen Shot 2016-07-18 at 15.11.40

At first glance you might think that it’s a bar chart and the bars are representing the data, but no, the size of the bars is completely unrelated to the values.

Here’s a version with bars the correct size overlaid…

Screen-Shot-2016-07-18-at-15.11.40

The UK doesn’t look quite so good when compared to Germany, France and Italy now.

So always make sure your bars correctly represent the data. In fact that’s good advice for any chart type, not just bar charts. Whether you’re making a pie chart, stacked bar chart or treemap…make sure your visual elements are scaled according to the values they correspond to!
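As a rough illustration of what “scaled according to the values” means for bars, here is a small Python sketch. It works out bar lengths from a zero baseline and checks whether a set of drawn lengths is proportional to the data. The figures and the function names `bar_lengths` and `is_proportional` are hypothetical; they are not the numbers from the Newport example.

```python
def bar_lengths(values, max_length):
    """Scale bars from a zero baseline so the largest value fills max_length."""
    largest = max(values.values())
    if largest <= 0:
        raise ValueError("need at least one positive value")
    return {name: max_length * v / largest for name, v in values.items()}


def is_proportional(values, lengths, tolerance=0.01):
    """Check that every bar has the same length per unit of value."""
    ratios = [lengths[name] / v for name, v in values.items() if v != 0]
    return max(ratios) - min(ratios) <= tolerance * max(ratios)


# Hypothetical figures for illustration only.
values = {"A": 40, "B": 30, "C": 10}
drawn = {"A": 200, "B": 180, "C": 150}  # bar lengths in pixels as drawn

print(is_proportional(values, drawn))                     # False
print(is_proportional(values, bar_lengths(values, 200)))  # True
```

The same check carries over to other chart types if you swap length for whatever carries the value, such as the angle of a pie slice or the area of a treemap tile.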

 
