Basics of graph plotting

There are 2 features I want to draw attention to.

One is the strong seasonal behaviour and the second is that there are large daily fluctuations.

Line Style

Let’s start with the most fundamental, the actual data.

Don’t assume just because it is time series data that you need to use a line.

The main reason for using a line is so you can track the order of the data points, but we’re not really interested in the precise order; we just want to show that there are large fluctuations.

Another problem with a line is that a lot of attention is drawn to outliers as they get more length of line associated with them.

Here’s my solution:

We now have 2 different styles, one for the raw data and one for the trend.

This is really effective in this case because there are also 2 features we want to draw attention to in our plot.

The thick blue line very clearly shows the cyclical, seasonal nature of the temperature and is the dominant focus of the plot.

The daily fluctuations are shown by the scatter and are less overpowering in pale grey dots.

For the trend line I calculated the mean temperature for each month and then interpolated quadratically back into a daily time series.

This produces a more aesthetically pleasing smooth plot.
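As a rough sketch of that trend calculation, the snippet below averages by calendar month and then interpolates quadratically back to daily values. It assumes pandas, SciPy and matplotlib, and it uses synthetic temperatures as a stand-in for the real data; the variable names, colours and marker sizes are my own choices, not necessarily those in the original code.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

# Synthetic stand-in for the temperature record (an assumption, not the real data).
rng = np.random.default_rng(0)
dates = pd.date_range("2010-01-01", "2013-12-31", freq="D")
day_number = np.arange(len(dates))
seasonal = 10 + 8 * np.sin(2 * np.pi * (day_number - 100) / 365.25)
temps = pd.Series(seasonal + rng.normal(0, 3, len(dates)), index=dates)

# Mean temperature for each calendar month, positioned at the middle of that month.
month_key = temps.index.to_period("M")
monthly_mean = temps.groupby(month_key).mean().to_numpy()
monthly_day = pd.Series(day_number, index=dates).groupby(month_key).mean().to_numpy()

# Quadratic interpolation of the monthly means back onto every day.
# Days before the first and after the last mid-month point are left as NaN.
interpolate = interp1d(monthly_day, monthly_mean, kind="quadratic",
                       bounds_error=False, fill_value=np.nan)
trend = pd.Series(interpolate(day_number), index=dates)

# Two styles: small pale dots for the raw data, a thick blue line for the trend.
fig, ax = plt.subplots(figsize=(10, 4))
ax.scatter(temps.index, temps.to_numpy(), s=4, color="lightgrey", zorder=1)
ax.plot(trend.index, trend.to_numpy(), color="tab:blue", linewidth=3, zorder=2)
```

Plotting the scatter with a lower zorder keeps the trend line on top, so the seasonal cycle stays the dominant feature.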

Axis Text

There are some blatant things wrong with the axis labels and there are also some stylistic ones that I think can be improved on:

I’ve made all the text larger and changed the colour to grey because I want the data to be the focus, not the axis labels.

I also think using a serif font in Python looks more professional.

The x axis: The x tick labels were too long and ended up overlapping.

One solution is to rotate the dates, but because I only have one tick per year I don’t need the day or month, so each label shows just the year.

Given the context we can also infer that the x axis is referring to the date without having to label it explicitly.

The y axis: I’ve also subtly changed the limits so that there is equal space above the top tick and below the bottom tick.

Kudos to anyone who spotted that I also made the width of the axis ticks 2 rather than the standard of 1.

VERY pedantic but it’ll make sense further down the page.
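The axis changes described above could look something like this in matplotlib. This is a sketch under assumptions: the data is synthetic, and the font sizes, the grey shade, the tick values and the bare "Temperature" label are illustrative choices rather than the values used for the original figure.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Serif text everywhere; set before the figure is created.
plt.rcParams["font.family"] = "serif"

# Synthetic daily data (an assumption, standing in for the real temperatures).
dates = np.arange("2010-01-01", "2014-01-01", dtype="datetime64[D]")
temps = 10 + 8 * np.sin(2 * np.pi * (np.arange(len(dates)) - 100) / 365.25)

fig, ax = plt.subplots(figsize=(10, 4))
ax.scatter(dates, temps, s=4, color="lightgrey")

# One tick per year, labelled with the year only, so the labels cannot overlap.
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))

# Larger grey text and ticks of width 2, so the labels recede behind the data.
ax.tick_params(axis="both", labelsize=14, labelcolor="grey", color="grey", width=2)
ax.set_ylabel("Temperature", fontsize=16, color="grey")  # unit omitted: not stated in the post

# Equal space below the bottom tick and above the top tick.
ticks = np.arange(0, 21, 5)
pad = 2.5
ax.set_yticks(ticks)
ax.set_ylim(ticks[0] - pad, ticks[-1] + pad)
```

No x axis label is set, in line with the point that the dates speak for themselves.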

Legend

There are a few formatting options available for your legend.

The most important, and the first you should always consider, is: do you need one?

In an ideal world your plot would be obvious without one. That’s not always achievable, but it should be top of your list of things to consider.

In this case, I think it’s pretty obvious, so adding a figure caption is the cleanest option.

Daily temperature measurements are shown by the grey dots with the 30 day mean shown in blue.

Where possible, position the legend in the figure because it keeps it nearby and you don’t have to go looking for it.

It also makes life a lot easier when transferring the figure into PowerPoint; a floating legend can make the image dimensions very wide and doesn’t leave much room for text on your slides.
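If you do need a legend, one way to keep it inside the figure is to leave some headroom above the data and place it there. The sketch below assumes matplotlib and synthetic data; the label text and the amount of headroom are illustrative.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Synthetic data (an assumption): one year of noisy daily values and a smooth trend.
rng = np.random.default_rng(1)
x = np.arange(365)
trend = 10 + 8 * np.sin(2 * np.pi * (x - 100) / 365.25)
daily = trend + rng.normal(0, 3, x.size)

fig, ax = plt.subplots(figsize=(8, 4))
ax.scatter(x, daily, s=4, color="lightgrey", label="Daily measurements")
ax.plot(x, trend, color="tab:blue", linewidth=3, label="Smoothed trend")

# Leave empty space above the data, then put the legend there, inside the axes.
ax.set_ylim(daily.min() - 2, daily.max() + 8)
legend = ax.legend(loc="upper right", frameon=False, ncol=2)
```

Because the legend sits inside the axes, the saved image keeps the same dimensions as the plot itself.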

However, don’t position the legend over a really important part of the graph (yes, I’ve seen it many times!!)

Bounding Box

I’m talking about the black rectangle that encompasses the plot and includes the x and y axes.

To me, it shuts off the visualisation from the text and, therefore, the narrative and that’s what it’ll do to your viewers.

I’m quite a new adopter of this but you can make it invisible!

Daily temperature measurements are shown by the grey dots with the 30 day mean shown in blue.

I’ve also added horizontal lines to mark the major ticks on the y axis so you can read the actual values more easily.

Again, these are very pale and plotted first so that they don’t overpower the eye but they’re there when you need them.

Remember I changed the width of the little axis ticks earlier? They’re now left out on their own without the main x and y axes, so it makes a little more sense to make them more dominant rather than have them mistaken for dust on the screen.
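A minimal matplotlib sketch of these bounding box changes, using synthetic data, might look like this. Note one substitution: rather than literally plotting the horizontal lines first, it uses the y axis grid together with set_axisbelow, which gives the same result of pale lines sitting beneath the data. The colours and tick length are my own choices.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Synthetic data (an assumption, standing in for the real temperatures).
rng = np.random.default_rng(2)
x = np.arange(365)
daily = 10 + 8 * np.sin(2 * np.pi * (x - 100) / 365.25) + rng.normal(0, 3, x.size)

fig, ax = plt.subplots(figsize=(8, 4))
ax.scatter(x, daily, s=4, color="grey")

# Hide all four sides of the bounding box.
for spine in ax.spines.values():
    spine.set_visible(False)

# Pale horizontal guide lines at the major y ticks, drawn beneath the data.
ax.yaxis.grid(True, color="gainsboro", linewidth=1)
ax.set_axisbelow(True)

# Heavier ticks so they still read clearly without the axis lines.
ax.tick_params(axis="both", width=2, length=6, color="grey")
```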

Conclusions

5 minutes of time working out how to perfect a plot are worth 5 hours of discussion about what it means.

Code

The code to plot the above perfected plot is given in this GitHub repo.
