Use your y-axis
…to its full potential
Last year we launched a data visualisation best practice section on Style.ONS.gov.uk. In an effort to keep it short, sweet and accessible we didn’t go into the reasoning behind the guidance – leaving that fuller understanding for the “introduction to data visualisation” course we run across the GSS. However, not everyone can make it to these courses, and so we’re going to share some of this thinking in a series of blogposts over the coming year. We hope you’ll find this useful in your own work.
We’ve got a number of topics lined up, but if you’ve got anything you’d like us to cover then leave a comment below or contact me @fryford, @HartlandZoe or any of the team and we’ll see if we can weave it in.
First up: the y-axis (and scaling more generally). I’ve lost count of the number of times we’ve had conversations about whether you can start a y-axis at a non-zero value. [YES YOU CAN…but be careful – we’ll go into this in a future post]. A quick Google reveals a few dozen posts on the subject, as it’s one of those eternal data vis questions, along with “are pie charts OK?”
BUT – we should be thinking beyond this non-zero question – let’s really make sure we’re using the y-axis scaling to deliver our message, and think about where we start and end our y-axis.
Take this chart for example – can you see anything wrong with it?
Proportion of female MPs in the UK House of Commons, 1945-2015
I don’t think there’s anything obviously wrong here – around 1974 it looks a bit odd because the data I took had two data points for that year. Other than that it looks like a pretty normal chart to me, and it works pretty well if our objective is to illustrate the large rise in the proportion of female MPs, but does it illustrate the whole picture?
Well, it all depends on how you look at it – what is the narrative that sits around this chart? Is there a policy target? From my quick research I can’t find an obvious one specifically for the UK, but across society there is a move towards equality, so perhaps a more realistic scale would run the axis from 0 to 50% – 50% representing equality?
Proportion of female MPs in the UK House of Commons, 1945-2015
The use of white space sets the scene and annotation is crucial in explaining this to the reader. After a bit more googling it seems that there is a commonly agreed international aim led by the United Nations, so let’s add this to the chart as well.
Proportion of female MPs in the UK House of Commons, 1945-2015
Already this chart is a huge improvement on what we started with. The added annotation, the policy context and the change of scale should all help deliver the messages of any accompanying text in this theoretical analysis. In fact, they should also help reduce the volume of text needed to convey the message, because the chart may well do most of the work. It delivers something that is very self-contained – in a world where everyone is increasingly bombarded with information and messages, people need content that is quick and easy to consume. This type of image would work particularly well on social media.
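For readers who build their charts in code, the rescaling described above might be sketched roughly as follows. This is a minimal Python sketch, not the way the charts in this post were produced: the names `y_axis_range` and `plot_share` are hypothetical, the choice of matplotlib is an assumption, and the shares are approximate rounded values included only for illustration.

```python
# Approximate, rounded shares of female MPs (%) at recent general elections.
# Illustrative values only; check them against the source series before use.
FEMALE_SHARE = {1997: 18, 2001: 18, 2005: 20, 2010: 22, 2015: 29}


def y_axis_range(values, reference=None, floor=0.0):
    """Return (low, high) for the y-axis.

    The axis runs from `floor` up to whichever is larger: the data
    maximum or the reference level (for example 50 for equality).
    """
    top = max(values)
    if reference is not None:
        top = max(top, reference)
    return (floor, top)


def plot_share(shares, reference=50.0, label="Equality (50%)"):
    """Draw the line chart with the axis extended to the reference level."""
    # Imported here so the helper above works without matplotlib installed.
    import matplotlib.pyplot as plt

    years = list(shares)
    values = list(shares.values())
    low, high = y_axis_range(values, reference)

    fig, ax = plt.subplots()
    ax.plot(years, values, marker="o")  # markers show where the data points are
    ax.set_ylim(low, high)              # axis now runs from 0 to the reference
    ax.axhline(reference, linestyle="--", color="grey")
    # Place the annotation just below the reference line, at the first year.
    ax.annotate(label, xy=(years[0], reference),
                xytext=(0, -12), textcoords="offset points")
    ax.set_ylabel("% of MPs")
    return fig
```

Calling `plot_share(FEMALE_SHARE)` returns a figure whose y-axis runs from 0 to 50, so the white space above the line is part of the message. A policy target could be passed as `reference` in the same way.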
Not without challenge
Of course, using the y-axis to its full extent isn’t without its challenges, particularly in an independent statistical office where we strive to remain neutral. Deciding what to present can be difficult – everyone consuming this will have their own point of view on what is objective. Tim Harford eloquently describes this challenge in a recent article titled “Delusions of objectivity”. Graphically, I’ve seen no finer example than this interactive illustration from the New York Times on the diverging perspectives of the Democratic and Republican parties.
This challenge shouldn’t deter us though – we need to focus on the careful use of language, reference to widely recognised events, policy changes and targets that are relevant to the analysis.
Why use a line chart at all?
You might ask why we are using a line chart at all. It would be a perfectly valid question – it’s easy to forget that the proportion of male MPs is the reverse of this picture. We could add another line for this, but now that we’re interested in both proportions we’ve switched our focus to a part-to-whole relationship – so let’s opt for a visual that firmly conveys this concept.
It gets slightly awkward now, as the time interval isn’t equal throughout and we don’t want to exaggerate or underplay any trends. However, the interval is most erratic during the early 20th century, when the proportion of female MPs was broadly stable, so I won’t worry about this too much in this example.
The proportion of female MPs has risen, but remains below the UN target
Proportion of female MPs in the UK House of Commons, 1945-2015
So you can see we’ve used a stacked bar chart here – scaling the axis from 0 to 100. We’ve used a stronger colour to highlight the category of main interest here, and set back the other category. It shows the full picture and the other side of the story that male MPs have formed the majority of the House of Commons for some time.
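The part-to-whole idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code behind the chart: `stacked_segments` is a hypothetical helper, and the 22% figure for 2010 is used only as an example input.

```python
def stacked_segments(female_share):
    """Turn female shares (%) into stacked-bar segments that sum to 100.

    Returns {year: [(label, bottom, height), ...]}, where `bottom` is the
    position on the 0 to 100 axis at which each segment starts.
    """
    bars = {}
    for year, female in female_share.items():
        if not 0 <= female <= 100:
            raise ValueError(f"share for {year} must be between 0 and 100")
        male = 100 - female  # the other side of the part-to-whole relationship
        bars[year] = [
            ("Female", 0, female),    # category of main interest, drawn first
            ("Male", female, male),   # stacked on top of the female segment
        ]
    return bars


segments = stacked_segments({2010: 22})
# segments[2010] == [("Female", 0, 22), ("Male", 22, 78)]
```

Because every bar is built to total 100, the axis range is fixed by the data structure itself, and the only remaining design choices are colour and order.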
You might have noticed my final tweak here – to add a more engaging and meaningful caption to the chart. Of course we should retain the more statistical title, explaining the measures and coverage but the caption helps prime the reader for what they are going to see, reducing the cognitive burden. It should make the chart that bit easier to interpret at a glance. It’s also a good exercise in summation – how would you have explained this chart in 140 characters or less for Twitter? What alt text would you have used as an accessibility fallback?
Join us soon for our next thrilling instalment…when is it OK not to start the y-axis at zero?
9 comments on “Use your y-axis”
A good article. There is, however, one aspect that’s troubling me slightly – specifically the highly discrete nature of the data – I say “highly discrete” because although it is possible for discrete data points to reflect something which is changing continuously, this clearly doesn’t apply to this series which (apart from minor changes at by-elections) represents a variable which is static until the next time point. This raises the question as to whether using a line chart is correct as people not familiar with the system might wrongly think, for example, that the proportion of female MPs in, say, 2012, was 25% (rather than 22%). However, using a bar chart, although theoretically better for discrete values, gives a less linear representation of time because, as you indicate, general elections haven’t been spaced evenly, and in 1974 there were two!
Potentially you could ameliorate this with a footnote, and your x-axis labelling could show just the years of the elections rather than 5 year time points. But if you’re using a line chart I think the biggest potential improvement would be if you had a “stepped line” such that the proportion changed only at the date of elections, and then stayed level until the next election (although better still would be if by-elections were also taken into account).
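The difference between the two readings can be shown with a short Python sketch. It assumes rounded shares of 22% after the 2010 election and about 29% after 2015; the function names `step_value` and `linear_value` are hypothetical and by-elections are ignored.

```python
from bisect import bisect_right

# (election year, approximate % of female MPs); rounded, illustrative values.
ELECTIONS = [(2010, 22.0), (2015, 29.0)]


def step_value(points, year):
    """Stepped reading: the share stays level until the next election."""
    years = [y for y, _ in points]
    i = bisect_right(years, year) - 1  # index of the latest election <= year
    if i < 0:
        raise ValueError("year is before the first election in the data")
    return points[i][1]


def linear_value(points, year):
    """Straight-line reading: what a reader might infer from a line chart."""
    for (y0, v0), (y1, v1) in zip(points, points[1:]):
        if y0 <= year <= y1:
            return v0 + (v1 - v0) * (year - y0) / (y1 - y0)
    raise ValueError("year is outside the range of the data")


stepped = step_value(ELECTIONS, 2012)   # 22.0
sloped = linear_value(ELECTIONS, 2012)  # roughly 24.8, which reads as 25%
```

In matplotlib, one way to draw the stepped version would be `ax.step(years, values, where="post")`, which holds each value until the next x position.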
Great summary of how to present data. Re accessibility – I consider colour blindness and other visual disabilities, and what it might look like on a projector.
Thanks Andrew, I don’t disagree with anything you’ve said there – a stepped line may well have been a better starting point. I’ve used marker points to indicate where there are data points, and it’s generally a subject that is well understood, so I hope I haven’t confused many people. The main focus here was to show people how a change in the y-scale can tell a very different story – which I hope I’ve done!
Thanks Tansy – all good things to consider from an accessibility perspective. We’ll cover the accessibility of colour in a future post.
Hi Rob – all very interesting and praiseworthy but ONS needs to get the basics right first. You’re showing line charts which have been rotated by 20 degrees, which is basically challenging the reader not to misinterpret what the chart is showing. You’ve also got a chart of population estimates over the past 30 years on your home page which is a complete waste of space, because annual growth rates are low enough that showing total population without a truncated y-axis effectively hides any story in the data. I think you need to do your homework before setting yourselves up as experts on good practice here.
Regards
Jon